Online Craziness: SkyNet Edition

February 20, 2009

The story “Experts Warn of Terminator-Style Military-Robot Rebellion” appeared on the FoxNews.com Web site here, and it carried a link to the February 19, 2009, Times of London. You will have to read the article and make up your own mind. For me, the most interesting comment was:

The report, the first serious work of its kind on military robot ethics, envisages a fast-approaching era where robots are smart enough to make battlefield decisions that are at present the preserve of humans. Eventually, it notes, robots could come to display significant cognitive advantages over Homo sapiens soldiers.

I quite like the phrase “Homo sapiens soldiers”. Within the last 24 hours, Windows 7 lost its ability to see a network-attached storage device. Our Vista test platform froze. And our Netfinity 5500 running Windows Server died. My BlackBerry would not render www.popurls.com. My two Macs lost the ability to see an external SATA drive with FAT32 partitions. Nope, I won’t worry too much about a robot taking over the mine drainage pond here in Harrod’s Creek for a while. Those Homo sapiens journalists at FoxNews.com and the Times of London may be in more immediate danger, however. Robots are not likely to kill addled geese since roast goose is not part of the robot diet, based on my experience.

Stephen Arnold, February 20, 2009

Google Gunk

February 20, 2009

Lee Gomes (now writing for Forbes with a picture too) raised an interesting point about Google’s 411 service here. His story “Google Gives and Takes Away” said:

…Instead of getting the simple vanilla address and phone number I was looking for, my screen results were crowded with Web businesses of dubious utility, offering to help me, say, read reviews of the sought-after doctor written by people I didn’t know and had no particular reason to trust…

The point is that Google gunk gets in the way of the information the user wants. Mr. Gomes did not use the impolite word “gunk”. I do. Ads and clutter may be part of the deterioration of Google that I have started to notice.

Stephen Arnold, February 19, 2009

Social Networks: Tempest in a Teapot

February 20, 2009

A happy quack to the reader who sent me a link to Conversationblog’s “Enterprise 2.0 Questions – Social Networks a Waste of Time” here. The write up points out that “social networks” already exist in organizations. I think I heard that in my high school civics class in 1958, so it is gratifying to learn that organizations consisting of people are social networks. For me, the most interesting comment in the write up was:

it is about setting down guidelines, agree on usage and put the tool – in this case internal social networks – within the company and employee context. Explain how and why the tool can add value – both to the company & the employee (effectiveness, productivity, reduction of search time, creation of virtual teams etc…) and you will probably be positively surprised.

My hunch is that the phrase “agree on usage” may become a sticky wicket in some organizations. Social uses of networks are part of the under-24-year-olds’ environment. Before the implosion of one of the largest banks in the US, I watched MBA wizards use their personal mobile devices alongside their super-secure company laptops. What made this interesting was that I overheard a new-hire briefer say that personal communication devices were not to be used in the facility.


Charcoal cooking. Slow, tasty, and best when a cook keeps his / her eye on the veggie burgers.

I think social software has been around for a long time. What’s new is the ubiquity of connectivity. Organizations seem to be getting into hot water. Whether it is the bank in Switzerland coughing up depositor information or the ineptitude of the Securities & Exchange Commission, it is clear that a number of employees, entrepreneurs, and managers do what they want. Social software and ubiquitous real-time communications lubricate the flow of information and disinformation. As a result, what control an organization had over its employees is getting harder to exercise.


Charcoal with an accelerant added. Fast, nasty, and no one in his / her right mind wants to watch the consequences.

Consider information. I am amazed at what factoids pop up on Twitter search. More interesting to me is that I can find a reference to a secure information system and then poke around with social systems and unearth specific details of that system. I don’t know if these details are supposed to be floating around like dust motes, but if I were working for a living at a company with Federal contracts, I would feel mightily uncomfortable about the information flowing without control through social communication systems. I know chatter around a water cooler (if these things still exist) is routine, but the speed of information dissemination and the tools available to suck in many factoids raise the stakes in my opinion.

My view is that social software is here. I agree with Conversationblog on this point. I am, however, not convinced anyone knows what the effects of this information accelerant will be. The difference between charcoal cooking a veggie burger and gunpowder is this accelerant angle. Information speed works differently from water cooler chatter. Think explosion. Random explosion at that.

Stephen Arnold, February 20, 2009

More Conference Woes

February 19, 2009

The landscape of conferences is like a hillside in the aftermath of the Mount St. Helens eruption. Erick Schonfeld has a useful write up about DEMO, a tony conference that seems to be paddling upstream. You can read his article “DEMO Gets Desperate: Shipley Out, Marshall In” here. DEMO is just one of many conferences facing a tough market with an approach that strikes me as expensive and better suited to an economy past. I received an email this morning from a conference organizer who sent me a request to propose a paper. My colleague in Toronto and I proposed a paper based on new work we had done in content management and search. The conference organizer told us that there were too many papers on that type of subject but we were welcome to pay the registration fee and come hear other speakers. My colleague and I thought, “First the organizer asks us to talk; then it’s bait and switch to make us paid attendees.” Our reaction was, “Not this time.” Here’s what I received in my email this morning, February 19, 2009:

Due to the current economy, I have decided to extend the Content Management ****/**** North America Conference Valentine’s Day discounted rate to March 2, 2009. This is a $200 discount for all Non-**** members. (**** members can register at anytime at a $300 discounted member rate.) This is meant for those of you needing additional time to get approval to attend the conference. I understand that with the current economy it is becoming harder to obtain funding for educational events. Hopefully by offering this type of discount I will be able to give you the extra support needed to get that final approval. [Emphasis added]

I have masked the specifics of this conference, but I read this with some skepticism.

Valentine’s Day is over. I surmise the traditional conference business is headed in that direction as well.

Telling me via an email that I need additional time to get approval to attend a conference is silly. I own my business. Furthermore, the organizer’s appeal  makes me suspicious of not just this conference but others that have been around a long time and offer little in the way of information that exerts a magnetic pull on me.

Conferences that have lost their sizzle are like my mom’s burned roast after a couple of days in the trash can. Not too appealing. What’s the fix? Innovation and creative thinking. Conference organizers who “run the game plan” don’t meet my needs right now. VentureBeat-type conferences do.

Stephen Arnold, February 19, 2009

Keeping Internet Transparency Clear

February 19, 2009

Lauren Weinstein of Vortex Technology, http://www.vortex.com/, just announced at http://lauren.vortex.com/archive/000506.html a set of forums he is hosting at http://forums.gctip.org/ called GCTIP–the Global Coalition for Transparent Internet Performance. It’s meant to address Internet transparency, performance, and ISP issues. The project grew out of a network measurement workshop sponsored by “Father of the Internet” Vint Cerf and Google, for which Weinstein helped organize the agenda. Weinstein’s point: it’s impossible to know whether we’re getting enough bang for our buck from the Internet unless we have hard facts, so set up measurement tests. My point: not only is this already being done, but how would you ever get definitive results, and how would an info dump help? Am I oversimplifying? Comments?
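
By way of illustration, here is a minimal sketch of the kind of user-side probe such transparency projects aggregate. Everything in it is an assumption for demonstration: the test URL is a placeholder, and a real study would use dedicated measurement servers and many repeated trials.

```python
# Minimal sketch of a user-side latency/throughput probe, the kind of
# measurement a GCTIP-style transparency effort might aggregate.
import time
import urllib.request

TEST_URL = "https://example.com/"  # placeholder; a real test uses dedicated servers

def probe(url: str) -> tuple[float, float]:
    """Return (seconds to first byte, effective bytes/sec for the full body)."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=10) as resp:
        first_byte = resp.read(1)          # time to first byte
        ttfb = time.monotonic() - start
        body = first_byte + resp.read()    # drain the rest of the body
    elapsed = time.monotonic() - start
    return ttfb, len(body) / elapsed

if __name__ == "__main__":
    ttfb, rate = probe(TEST_URL)
    print(f"TTFB: {ttfb:.3f}s, throughput: {rate / 1024:.1f} KiB/s")
```

A single run like this proves little, which is rather the point: one needs many probes, from many vantage points, before the “hard facts” Weinstein wants emerge.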

Jessica Bratcher, February 19, 2009

Mysteries of Online 7: Errors, Quality, and Provenance

February 19, 2009

This installment of “Mysteries of Online” tackles a boring subject that means little or nothing to the entitlement generation. I have recycled information from one of my talks in 1998, but some of the ideas may be relevant today. First, let’s define the terms:

  • Errors–Something does not work. Information may be wildly inaccurate but the user may not perceive this problem. An error is a browser that crashes, a page that doesn’t render, a Flash that fails. This notion of an error is very important in decision making. A Web site that delivers erroneous information may be perceived as “right” or “good enough”. Pretty exciting consequences result from this notion of an “error” in my experience.
  • Quality–Content displayed on a Web page is consistent. The regularity of the presentation of information, the handling of company names in a standard way, and tidy rows and columns with appropriate values become “quality” output in an online experience. The notions of errors and quality combine to create a belief among some that if the data come from the computer, then those data are right, accurate, reliable.
  • Provenance–This is the notion of knowing from where an item came. In the electronic world, I find it difficult to figure out where information originates. The Washington Post reprints a TechCrunch article from a writer who has some nerve ganglia embedded in the companies about which she writes. Is this provenance enough, or do we need the equivalent of a PhD from Oxford University and a peer-reviewed document? In my experience, few users of online information know, or know how, to think about the provenance of the information on a Web page or in a search results list. Pay for placement adds spice to provenance in my opinion.


So What?

A gap exists between individuals who want to know whether information is accurate and can be substantiated from multiple sources and those who take what’s on offer. Consider this Web log post. If someone reads it, will that individual poke around to find out about my background, my published work, and my history? In my experience, I see a number of comments that say, “Who do you think you are? You are not qualified to comment on X or Y.” I may be an addled goose, but some of the information recycled for this Web log is more accurate than what appears in some high profile publications. A recent example was a journalist’s reporting that Google’s government sales were about $4,000, down from a couple of hundred thousand dollars. The facts were wrong, and when I checked back on that story, I found that no one had pointed out the mistake. A single GB 7007 can hit $250,000 without much effort. It doesn’t take many Google Search Appliance sales to beat $4,000 a year in revenue from Uncle Sam.
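
For the arithmetic-minded, a back-of-the-envelope check using only the figures already cited above (the $4,000 reported figure and the $250,000 loaded GB 7007 price; actual GSA pricing varied by configuration):

```python
# Sanity check on the reported figure, using only the numbers cited above.
reported_revenue = 4_000     # journalist's figure for Google's government sales, USD
gb7007_price = 250_000       # one loaded GB 7007 appliance, USD (figure cited above)

# A single appliance sale exceeds the reported annual figure many times over.
print(gb7007_price / reported_revenue)   # -> 62.5
```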

The point is that most users:

  1. Lack the motivation or expertise to find out if an assertion or a fact is correct or incorrect. In my opinion, few people care much about the dull stuff–chasing facts. Even when I chase facts, I can make an error. I try to correct those I can. What makes me nervous are those individuals who don’t care whether information is on target.
  2. Do not see research as a core competency. Research is difficult and a thankless task. Many people tell me that they have no time to do research. I received an email from a person asking me how I could post to this Web log every day. Answer: I have help. Most of those assisting me are very good researchers. Individuals with solid research skills do not depend solely upon the Web indexes. When was the last time a colleague of yours did research among sources other than those identified in a Web index?
  3. Get confused by too many results. Most users look at the first page of search results. Fewer than five percent of online users make use of advanced search functions. Google, based on my research, takes a “good enough” approach to its search results. When Google needs “real” research, the company hires professionals. Why? Good enough is not always good enough. Simplified searching and finding is now a habit. Lazy people use Web search because it is easy. Remember: research is difficult.


Google: Warning Bells Clanging

February 19, 2009

Henry Blodget wrote “Yahoo Search Share Rises Again… And Google Falls” here. The hook for the story is comScore tracking data showing that Google’s share of the Web search market “dropped a half point to 63%.” Mr. Blodget added, quite correctly, “You don’t see that every day.” Mr. Blodget also flags Yahoo’s increase in search share, which jumped to 21%. Yahoo has made gains in share for the last five months. Congratulations to Yahoo.

Several comments:

  1. Data about Web search share is often questionable.
  2. Think back to your first day in statistics. Remember margin of error? When you have questionable data, a narrow gain or loss, and a data-gathering system based on some pretty interesting data collection methods–what do you get? You get Jello data. (A rough sketch of the arithmetic appears after this list.)
  3. The actual Web log data for outfits like Google and Yahoo often tell the company employees a different story. How different? I was lucky enough last year to see some data that revealed Google’s share of the Web search market north of 80 percent in the US. So which data are correct? The point is that sampled data about Web search usage is wide of the actual data by 10 to 15 percent or more.
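
To put a number on the Jello, here is a rough 95 percent margin-of-error calculation for a share estimate drawn from a panel sample. The panel size is a guess purely for illustration; comScore does not publish a simple n, and panel weighting tends to make the true uncertainty larger, not smaller.

```python
# Rough 95% margin of error for a share estimate from a panel sample.
# The panel size n is a guess for illustration only; real panel weighting
# makes the true error larger than this simple binomial formula suggests.
import math

share = 0.63      # reported Google share of Web search
n = 10_000        # hypothetical effective panel size

moe = 1.96 * math.sqrt(share * (1 - share) / n)
print(f"95% margin of error: +/- {moe * 100:.2f} points")
# -> about +/- 0.95 points, so a half-point monthly move is inside the noise
```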

Is Google in trouble? Not as much trouble as Yahoo. Assume the data are correct. The spread between Yahoo and Google is about 40 points. Alarmism boosts traffic more easily than Yahoo can boost its share of the Web search market in my opinion.

Stephen Arnold, February 18, 2009

Alacra Raises Its Pulse

February 19, 2009

Alacra Inc., http://www.alacra.com/, released its Pulse Platform today. Along the lines of Beyond Search’s own Overflight service, Alacra’s Pulse finds, filters, and packages Web-based content by combining semantic analysis and an existing knowledge base to target and analyze more than 2,000 hand-selected feeds. The first platform app is Street Pulse, which carves out “what do key opinion leaders have to say…” about a given company. A press release says it “integrates comments from sell-side, credit analysts, industry analysts and a carefully vetted list of influential bloggers.” It’s offered free at http://pulse.alacra.com. There’s also a clean, licensed professional version with bells and whistles like e-mail alerts. More apps will follow that may focus on the hush-hush company droppings that everyone loves to muck through. Alacra’s jingle is “Aggregate, Integrate, Package, Deliver.” Since so many people are already wading through info dumps, we see this service growing into a critical search resource. We’re certainly on board the bandwagon and will be watching for further developments.

Jessica W. Bratcher, February 19, 2009

Semantics in Firefox

February 19, 2009

Now available: the wonders of semantic search, plugged right into your Mozilla Firefox browser. headup started in closed testing but is now a public beta downloadable from http://www.headup.com or http://addons.mozilla.org. You do have to register for it because Firefox lists it as “experimental,” but the reviews at https://addons.mozilla.org/en-US/firefox/reviews/display/10359 are glowing. A product of SemantiNet, this plugin is touted to enable “true semantic capabilities” for the first time within any Web page. headup’s engine extracts customized info based on its user and roots out additional data of interest from across the Web, including social media sites like Facebook and Twitter. Looks like this add-on is a step in the right direction to bring the Semantic Web down to earth. Check it out and let us know what you think.

Jessica Bratcher, February 19, 2009

Mahalo: SEO Excitement

February 18, 2009

If a Web site is not in Google, the Web site does not exist. I first said this in 2004 in a client briefing before my monograph The Google Legacy was published by Infonortics Ltd. The trophy MBAs laughed and gave me the Ivy draped dismissal that sets some Wall Street wizards (now former wizards) apart. The reality then was that other online indexing services were looking at what Google favored and emulating Google’s sparse comments about how it determined a Web site’s score. I had tracked down some of the components of the PageRank algorithm from various open source documents, and I was explaining the “Google method” as my research revealed it. I had a partial picture, but it was clear that Google had cracked the problem of making the first six or seven hits in a result list useful to a large number of people using the Internet. My example was the query “spears”. Did you get Macedonian spears or links to aboriginal weapons? Nope. Google delivered the pop sensation Britney Spears. Meant zero to me, but with Google’s surging share of the Web search market at that time, Google had hit a solid triple.
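
For readers who have not dug through those open source documents, here is a minimal sketch of the power iteration at the heart of the published PageRank method (the Brin and Page formulation). Google’s production ranking layers many additional signals on top; this toy version shows only the core link-voting idea.

```python
# Minimal power-iteration sketch of the published PageRank formulation.
# Production Google blends many more signals; this is only the core idea.
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            targets = outlinks or pages        # dangling page: spread evenly
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Toy graph: two pages pointing at "hub" lift its score above the others.
print(pagerank({"hub": ["a"], "a": ["hub"], "fan": ["hub"]}))
```

The SEO trade amounts to attempts to manufacture the inbound “votes” this loop counts, which is why link schemes draw Google’s attention.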

The SEO (search engine optimization) carpetbaggers sensed a pot of gold at the end of desperate Web site owners’ ignorance. SEO provides advice and some technical services to boost a Web page’s or a Web site’s appeal to the Google PageRank method. Over the intervening four or five years, a big business has exploded to help a clueless marketing manager justify the money pumped into a Web site. Most Web sites get minimal traffic. Violate one of Google’s precepts, and the Web site can disappear from the first page or two of Google results. Do something really crazy like BMW’s spamming or Ricoh’s trickery, and Googzilla removes the offenders from the Google index. In effect, the Web site disappears. This was bad a couple of years ago, but today it is the kiss of death.

I received a call from a software company that played fast and loose with SEO. The Web site disappeared into the depths of the Google result list for my test queries. The aggrieved vice president (confident of his expertise in indexing and content) wanted to know how to get back in the Google index and then to the top of the results. My answer then is the same as it is today, “Follow the Google Webmaster guidelines and create compelling content that is frequently updated.”

Bummer.

I was fascinated with “Mahalo Caught Spamming Google with PageRank Funneling Link Scheme” here. The focal point of the story is that Mahalo, a company founded by former journalist Jason Calacanis, allegedly “was caught ranking pages without any original content–in clear violation of Google’s guidelines.” The article continued:

And now he has taken his spam strategy one step further, by creating a widget that bloggers can embed on their blogs.

You can read the Web log post and explore the links. You can try to use the referenced widget. Have at it. Furthermore, I don’t know if this assertion is 100 percent accurate. In fact, I am not sure I care. I see this type of activity, whether real or a thought experiment, as reprehensible. Here’s why:

  1. This gaming of Google and other Web indexing systems costs the indexing companies money. Engineers have to react to the tricks of the SEO carpetbaggers. The SEO carpetbaggers then try to find another way to fool the Web indexing system’s relevance ranking method. A thermonuclear war ensues, and the costs of this improper behavior suck money from other needed engineering activities.
  2. The notion that a Web site will generate traffic and pay for itself is a fantasy. It was crazy in 1993 when Chris Kitze and I started work on The Point (Top 5% of the Internet), which is quite similar to some of the Mahalo elements. There was no way to trick Lycos or Harvest because it was a verifiable miracle if those systems could update their indexes and handle queries with an increasing load and what is now old-fashioned, inefficient plumbing. Somehow a restaurant in Louisville, Kentucky, or a custom boat builder in Arizona thinks a Web site will automatically appear when a user types “catering” or “custom boat” in a Google search box. Most sites get minimal traffic, and some may be indexed on a cycle ranging from several days to months. Furthermore, some sites are set up in such a wacky way that the indexing systems may not try to index the full site. The problem is not SEO; the problem is a lack of information about what’s involved in crafting a site that works.
  3. Content on most Web sites is not very good. I look at my ArnoldIT.com Web site and see a dumping ground for old stuff. We index the content using the Blossom search system so I can find something I wrote in 1990, but I would be stunned if I ran a query for “online database” and saw a link to one of my essays. We digitized some of the older stuff, but no one–I repeat–no one looks at the old content. The action goes to the fresh content on the Web log. The “traditional” Web site is a loser except for archival and historical uses.

The fact that a company like Mahalo allegedly gamed Google is not the issue. The culture of cheating and the cult of SEO carpetbaggers makes this type of behavior acceptable. I get snippy little notes from those who bilk money from companies who want to make use of online but don’t know the recipe. The SEO carpetbaggers sell catnip. What these companies need is boring, dull, and substantial intellectual protein.

Google, Microsoft, and Yahoo are somewhat guilty. These companies need content to index. The SEO craziness is a cost of doing business. If a Web site gets some traffic when new, that’s by design. Over time, the Web site will drift down. If the trophy generation Webmaster doesn’t know about content and freshness, the Web indexing companies will sell traffic.

There is no fix. The system is broken. The SEO crowds pay big money to learn how to trick Google and other Web indexing companies. Then the Web indexing companies sell traffic when Web sites don’t appear in a Google results list.

So what’s the fix? Here are some suggestions:

  1. A Web site is a software program. Like any software, a plan, a design, and a method are needed. This takes work, which is reprehensible to some. Nevertheless, most of the broken Web sites cannot be cosmeticized. Some content management systems generate broken Web sites as seen by a Web indexing system. Fix: when possible, start over and do the fundamentals.
  2. Content has a short half life. Here’s what this means. If you post a story once a month, your Web site will be essentially invisible even if you are a Fortune 50 company. Exceptions occur when an obscure Web site breaks a story that is picked up and expanded by many other Web sites. Fix: write compelling content daily or better yet more frequently.
  3. Indexing has to be useful to humans and to content processing systems. Stuffing meaningless words into a metatag is silly and counterproductive. Hiding text by tinting it to match a page’s background color is dumb. Fix: find a librarian or, better yet, take a class in indexing. Select meaningful terms that describe the content of the page accurately. The more specialized your terminology, the narrower the lens. The broader the term, the wider the lens. Broad terms like “financial services” are almost useless, since the bound phrase is devalued. Try some queries looking for a financial services firm in a mid-sized city. Tough to do unless you get a hit in http://local.google.com or just look up the company in a local business publication or ask a friend. (A toy illustration of this indexing point appears after this list.)
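
Here is that toy illustration of point 3: a crude check that flags meta keywords which never appear in a page’s visible text, one telltale of term stuffing. The sample page is invented for demonstration; real index-spam detection is far more involved than this.

```python
# Toy keyword-stuffing check: flag meta keywords that never appear in the
# page's visible text. Only illustrates the "index terms must describe the
# content" point; real spam detection is far more involved.
from html.parser import HTMLParser

class PageScan(HTMLParser):
    def __init__(self):
        super().__init__()
        self.keywords: list[str] = []
        self.text: list[str] = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() == "keywords":
            self.keywords = [k.strip().lower()
                             for k in (a.get("content") or "").split(",")]

    def handle_data(self, data):
        self.text.append(data.lower())

sample = """<html><head>
<meta name="keywords" content="goose pond, mine drainage, free ipod">
</head><body><p>Notes on the goose pond and mine drainage.</p></body></html>"""

scan = PageScan()
scan.feed(sample)
body = " ".join(scan.text)
for kw in scan.keywords:
    if kw not in body:
        print(f"suspect term (not in visible text): {kw!r}")
# -> suspect term (not in visible text): 'free ipod'
```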

As for Mahalo, who cares? The notion of user-generated links from a subject matter expert worked in 1993. The method has been replaced by http://search.twitter.com or asking a friend on Facebook.com. Desperate measures are needed when traffic goes nowhere. “Just don’t get caught” is the catchphrase in my opinion.

Stephen Arnold, February 18, 2009
