San Diego: Finally Showing Muscle

February 26, 2009

I had heard that San Diego had some issues with its enterprise software and systems. San Diego has some interesting characteristics. There are some interesting high-tech outfits in the city; the giant SAIC comes to mind. San Diego also has lots of McMansions with brutal vacancy rates and a light rail line to Mexico. I read Michael Krigsman’s “San Diego Fires Axon over ERP Implementation Problems” here. I wonder if other governmental agencies will take this type of alleged action. In my experience, there are some interesting cost estimation and contractual practices afoot in some governmental entities. Combine experienced government massagers with overworked and understaffed government agencies, and you get a recipe that spells “cost overruns”. One of the comments that caught my attention was:

Axon has submitted incomplete work product for acceptance and payment.

If accurate, this sentence is a model of clarity. Most government “good bye, dear” memos are more jargon filled. How will San Diego resolve its enterprise software mess? Easy. Hire SAP. I am delighted because I know a follow up story will be possible in a year, maybe less. Wow.

Stephen Arnold, February 27, 2009

Written by Stephen E. Arnold · Filed Under Enterprise, News, Online (general) | Comments Off on San Diego: Finally Showing Muscle

Amazon Data Sets

February 25, 2009

A reader of Google patent documents will be aware of the company’s nudging forward in content delivery. Amazon has an uncanny ability to stay one step ahead of Googzilla. Google might want to bite the bullet and buy Amazon. In my opinion, the wizards headed by the world’s smartest man could end the embarrassment. Amazon has expanded beyond eCommerce. Amazon has crafted a handhold in cloud services. Now Amazon has pushed into content delivery or, more accurately, content access. You can read about the public data sets available via Amazon Web Services here. Who cares about these data sets? Well, frankly, not the folks who search for “Perez Hilton”. The key point is that Amazon moves more quickly than Google into spaces that Google’s technical papers and patent documents suggest Google itself finds interesting. A year ago, I held the view that Google was biding its time. Now, I’m not so sure. I think Google is unable to find the right sequence of key presses to unlock some opportunities. Google is now facing three competitors who no longer worry too much about the Mountain View crowd: Amazon, Facebook, and Twitter. As search shifts to real time, Google is beginning to look less agile in my opinion. Add the YAGG (yet another Google glitch) incidents, like the Gmail meltdown and the phishing issue I noted yesterday, to Google’s inability or unwillingness to respond in this content sector, and 2009 is shaping up to be quite exciting. Amazon, Facebook, and Twitter are far from perfect. But, count ’em, three companies are finding traction where Googzilla’s paws are slipping.

Stephen Arnold, February 25, 2009

Written by Stephen E. Arnold · Filed Under Business strategy, Google, News, Online (general), Technology | Comments Off on Amazon Data Sets

Google News: Spiffed Up

February 25, 2009

I may be late to the party, but my Google News Alert has a snappy new look today. I am seeing a cleaner, tidier layout and more images. I’ve been a fan of Google Alerts for quite a while. Compared to the Yahoo Alerts, Googzilla delivers fewer false drops and is not prone to tripping on the slugs and decks in articles, a problem which has befuddled Yahooligans for years. Here’s a screen shot of what I saw this evening. If this is old news to you, no problemo. The look is fresh to me.

Stephen Arnold, February 24, 2009

Written by Stephen E. Arnold · Filed Under Google, News, Online (general) | 2 Comments

YAGG: Google Talk

February 24, 2009

Tweets and posts are flying by about an alleged phishing exploit for Google email. Mashable reports here that another issue may be poking its snout into hapless users’ lives. Adam Ostrow wrote:

Gmail is now being attacked by a phishing scam that is spreading like wildfire.

If true, YAGG strikes again. You get a message, “check me out”, with a link to a tinyurl. Click the puppy and you go to “a site called ViddyHo.” Lucky you. Your contacts get an email. Nifty. Love those tiny urls which mask the destination url.

Stephen Arnold, February 24, 2009

Written by Stephen E. Arnold · Filed Under Cloud computing, Google, News, Online (general) | 1 Comment

Amazon: Outage Reported

February 24, 2009

The old US of A’s computing infrastructure seems to be showing that it ain’t what it used to be. ComputerWorld’s Sumner Lemon wrote here “Amazon Search Engine Suffers Brief Outage.” I have not been too thrilled with some of the features of the A9 system. But my quibbles are minor compared to the search system’s not working. Search is the means by which Amazon generates the bulk of its money. The vaunted cloud services are still modestly sized French fries at the Amazon revenue feast. The system was down for about an hour, but, hey, cloud services are supposed to have rock solid uptime, or so this addled goose thought.

Stephen Arnold, February 24, 2009

Written by Stephen E. Arnold · Filed Under News, Online (general), Search | Comments Off on Amazon: Outage Reported

Google: Yet Another Google Glitch

February 24, 2009

YAGG (a new acronym pronounced like “gag” as in choke) has been coined by the goslings in the mine drainage pond. The addled goose has little to add to the Washington Post’s headline “Trouble in the Clouds: Gmail Turns into Gfail” here. Xooglers are sending nasty grams to other Googlers, when Gmail and MOMA work obviously. Glitches are becoming very Vista like in the opinion of the addled goose. The reasons offered by main stream dead tree publications omit such interesting causes as:

Googlers are smart, but the size of the company has made the culture susceptible to the Microsoft product management disease.
Dependencies within the system are usually trapped by Google’s compile time checks and the peer quality assurance project, but as more Googlers become too busy, little errors can grow up to be big mistakes. Google has not created an Orkut class issue, but the Gmail issue is more immediate.
Problems are evident in such unrelated areas as ad metrics, malware flagging, and, today, mail.

Too bad there is no competitor in a position to challenge the GOOG. A decade of indifference has created a culture of failure among Google’s direct competitors and now a soupçon is evident to the addled goose in some Google functions. Just my honking opinion. I don’t have a fix. The future is evident to me for some Google services. I can see that vista before me. Can you? More pointedly, can you see your Gmail?

Stephen Arnold, February 24, 2009

Written by Stephen E. Arnold · Filed Under Google, News, Online (general) | Comments Off on Google: Yet Another Google Glitch

Twitter Security: An Oxymoron

February 24, 2009

PCWorld’s Joan Goodchild wrote an interesting article about Twitter’s security issues here. She identifies three potential areas of concern. First, a url shortener can send a hapless user to an unknown and potentially harmful location. Second, she identifies a lack of email authentication. And, third, my favorite: Twitter can be useful to those who want to “follow” a person. The addled goose is confident that these three issues do not exhaust the security vulnerabilities. The goose does not directly Twitter, send tweets, or fiddle with Twitter ecosystem tools. Those who follow the goose often want to cook it. Could Twitter users get their geese cooked?

Stephen Arnold, February 24, 2009

Written by Stephen E. Arnold · Filed Under Mobile, News, Online (general), Social | Comments Off on Twitter Security: An Oxymoron

Mysteries of Online 8: Duplicates

February 24, 2009

In print, duplicates are the province of scholars and obsessives. In the good old days, I would sit in a library with two books. I would then look at the data in one book and then hunt through the other book until I located the same or similar information. Then I would examine each entry to see if I could find differences. Once I located a major difference such as a number, a quotation, or an argument of some type, I would write down that information on a 5×8 note card. I had a forensics scholarship along with some other cash for guessing accurately on objective tests. To get the forensics grant, I had to participate in cross examination debate, extemporaneous speaking, and just about any other crazy Saturday time waster my “coaches” demanded.

Not surprisingly, mistakes or variances in books, journals, and scholarly publications were not of much concern to some of the students who attended the party school that accepted an addled goose with thick glasses. There were rewards for spending hours looking for information and then chasing down variances. I recall that our debate team, which was reasonably good if you liked goose arguments, was up against a team from Dartmouth College. I was listening when I heard a statement that did not match what I had located in a government reference document and in another source. The opponent from Dartmouth had erroneously presented the information. I gave a short rebuttal. I still remember the look of nausea that crossed our opponent’s face when she realized that I had presented what I found in my hours of manual checking and reminded the judges that distorting information suggests an issue with the argument. We won.

For most people, the notion of having two individuals with the same source is an example of duplicate information. Upon closer inspection, duplication does not mean identical in gross features. Duplication drills down to the details of the information and to the need to determine which item of information is at variance and then figuring out why and what is the most likely version of the duplicate.

That’s when the fun begins in traditional research. An addled goose can do this type of analysis. Brains are less important than persistence and a tolerance for some dull, tedious work. As a result, finding duplicative information and then figuring out variances was not something that the typical college sophomore spent much time doing.

Enter computer systems.
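A rough illustration of how software can take over the note card work described above: the short Python sketch below compares two near-duplicate statements word by word and reports where they vary. The two sample sentences are invented for this example, and the function name find_variances is hypothetical. The sketch relies on Python's standard difflib module and is only one of many ways to approach the problem, not a description of any particular vendor's method.

```python
from difflib import SequenceMatcher


def find_variances(a: str, b: str):
    """Compare two statements word by word.

    Returns a similarity ratio between 0 and 1 and a list of
    (text_in_a, text_in_b) pairs for the spans that differ.
    """
    words_a, words_b = a.split(), b.split()
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    variances = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # "replace", "delete", or "insert"
            variances.append(
                (" ".join(words_a[i1:i2]), " ".join(words_b[j1:j2]))
            )
    return matcher.ratio(), variances


# Invented sample data: two "duplicate" entries that disagree on one number.
source_one = "The agency reported 1,200 cases in 1968"
source_two = "The agency reported 1,500 cases in 1968"

ratio, diffs = find_variances(source_one, source_two)
print(f"similarity: {ratio:.2f}")
for left, right in diffs:
    print(f"variance: {left!r} vs {right!r}")
```

The software flags the variance in a fraction of a second. Deciding which version is the most likely correct one, and why the sources disagree, still takes the persistence described above.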

Written by Stephen E. Arnold · Filed Under Database, Enterprise, Feature, Online (general), Text analytics, Text processing | Comments Off on Mysteries of Online 8: Duplicates

Deep Web, Surface Sparkles Occlude Deeper Look

February 23, 2009

You can read pundits, mavens, and wizards comment on the New York Times’s “Exploring a Deep Web that Google Can’t Grasp.” The original is here for a short time. Analysis of varying degrees of usefulness appears in Search Engine Land and the Marketing Pilgrim’s “Discovering the Rest of the Internet Iceberg” here.

There’s not much I can say to reverse the flow of misinformation about what Google is doing because Google doesn’t talk to me or to the pundits, mavens, and wizards who explain the company’s alleged weaknesses. In 2007, I wrote a monograph about Google’s programmable search engine disclosures. Published by Bear Stearns, this document is no longer available. I included the dataspace research in my Beyond Search study for The Gilbane Group in April 2008. Then, in September, Sue Feldman and I wrote about Google’s dataspace technology. You can get a copy of the dataspace report directly from IDC here. Ask for document 213562. Both of these studies explicate Google’s activities in structured data and how those data mesh with Google’s unstructured information methods. I did a detailed explanation of the programmable search engine inventions in Google Version 2.0. That report is still available, but it costs money and I will be darned if I will restate information that is in a for fee study. There are some brief references to these technologies available at ArnoldIT.com without charge and in the archive to this Web log. You can search the ArnoldIT.com archive at www.arnoldit.com/sitemap.html and this Web log from the search box on any blog page.

This sure looks like “deep Web” information to me. But I am not a maven, wizard, or pundit. Nor do I understand search with the depth of the New York Times, search engine optimization experts, and trophy generation analysts. I read patent documents, an activity that clearly disqualifies me from asserting that Google can’t perform a certain action based on its open source disclosures. Life is easier when such disclosures are ignored or excluded from the research process.

So what? Two points:

Google can and does handle structured data. Examples exist in the wild at base.google.com and by entering the query “lga sfo” from Google.com’s search box.
Yip yap about the “deep Web” has been popular for a while, and it is an issue that requires more analysis than assertions based on relatively modest research into the subject.

In my opinion, before asserting that Google is baffled, off track, clueless, or slow on the trigger, look a bit deeper than the surface sheen on Googzilla’s scales. No wonder outfits are surprised by some of Google’s “effortless” initiatives. Those who deal in superficiality do not see the substance that resides under the surface.

Pundits, mavens, wizards, please, take a moment to look into Guha, Halevy, and the other Googlers who have thought about and who are working on structured, semistructured, and unstructured data in the Google data environment. That background will provide some context for Google’s apparent sluggishness in this “space”.

Stephen Arnold, February 23, 2009

Written by Stephen E. Arnold · Filed Under Business strategy, Database, Google, News, Online (general), Semantic, Technology, Text analytics, Text processing | 2 Comments

Exclusive Interview, Martin Baumgartel, From Library Automation to Search

February 23, 2009

For many years, Martin Baumgartel worked for a unit of T-Mobile. His experience spans traditional information retrieval and next-generation search. Stephen Arnold and Harry Collier interviewed Mr. Baumgartel on February 20, 2009. He is one of the featured speakers at the premier search conference this spring, where you will be able to hear Mr. Baumgartel’s lecture and meet with him in the networking and post presentation breaks. The Boston Search Engine Meeting attracts the world’s brightest minds and most influential companies to an “all content” program. You can learn more about the conference, the tutorials, and the speakers at the Infonortics Ltd. Web site. Unlike other conferences, the Boston Search Engine Meeting limits attendance in order to facilitate conversations and networking. Register early for this year’s conference.

What’s your background in search?

When I entered the search arena in the 1990s, I originated from library automation. Back then, it was all about indexing algorithms and relevance ranking where I did research to develop a search engine. During eight years at T-Systems, we analyzed the situation in large enterprises in order to provide the right search solution. This included, increasingly, the integration of semantic technologies. Given the present hype about semantic technologies, it has been a focus in current projects to determine which approach or product can deliver in specific search scenarios. A related problem is to identify underlying principles of user-interface-innovations to know what’s going to work (and what’s not).

What are the three major challenges you see in search / content processing in 2009?

Let me come at this in a non technical way. There are plenty of challenges awaiting algorithmic solutions, but I see more important challenges here:

Identifying the real objectives, fighting myths. For an organization, implementing internal search today hasn’t become any easier. There are numerous internal stakeholders, paired with very high user expectations (they want the same quality as with Internet search, only better, more tailored to their work situation and without advertising…). Keeping a sharp analysis becomes difficult in an orchestra of opinions, in particular when familiar brand names get involved (“Let’s just take Google internally, that will do.”).
Avoid simplicity. Although many CIOs claim they have “cleaned up” their intranets, enterprise search remains complex, both technologically and in terms of successful management. Therefore, tackling the problem with a self-proclaimed simple solution (plug in, ready, go) will provide Search, but perhaps not the search solution needed, and with hidden costs, especially in the long run. At the other extreme, a design that is too complex, with the purchase of dozens of connectors, is likely to burst your budget.
Attention. Recently, I heard a lot about how the financial crisis will affect search. In my view, the effects are only reinforcing the challenge “How to draw enough management attention to Search to make sure it’s treated like other core assets”. Some customers might slow down the purchase of some SAP add-on modules or postpone a migration to the next version of Backup Software. But the status of those solutions among CIOs will remain high and unquestioned.

With search / content processing decades old, what have been the principal barriers to resolving these challenges in the past?

There’s no unique definition of the “Enterprise Search Problem” as if it were a math theorem. Therefore, you find somewhat amorphous definitions of what is to be solved. Let’s take the scope of content to be searched: everything internal? And nothing external? Another obstacle is the widespread belief in shortcuts. A popular example: Let’s just index the content present in our internal content management system, the other content sources are irrelevant. That way, the concept of completeness in search/result set is sacrificed. But search can be as gruesome as the Marathon: you need endurance and there are no shortcuts. If you take a shortcut, you’ve failed.

What is your approach to problem solving in search and content processing?

Smarter software definitely, because the challenges in search (and there are more than three) are attracting programmers and innovators to come up with new solutions. But, in general, my approach is “keep your cool”. Assess the situation, analyze tools and environment, design the solution and explain it clearly. In the process, interfaces have to be improved sometimes in order to trim them down to fit with the corporate intranet design.

With the rapid change in the business climate, how will the increasing financial pressure on information technology affect search / content processing?

We’ll see how far a consolidation process will go. Perhaps we’ll see discontinued search products where we initially didn’t expect it. Also, the relation asked in the following question might be affected: software companies are unlikely to cut back at core features of their product. But integrated search functions are perhaps identified for the scalpel.

Search / content processing systems have been integrated into such diverse functions as business intelligence and customer support. Do you see search / content processing becoming increasingly integrated into enterprise applications?

I’ve seen it the other way around: Customer Support Managers told me (the Search person) that the built-in search-tool is ok but that they would like to look up additional information from some other internal applications. I don’t believe that built-in search will replace stand-alone search. The term “built-in” tells you that the main purpose of the application is something else. No surprise that, for instance, the user interface was designed for this main purpose – and will, in conclusion, not address typical needs of search.

Google has disrupted certain enterprise search markets with its appliance solution. What can a vendor do to adapt to this Google effect?

A vendor should point out where he differs from Google and why to address this Google-effect.

But I see Google as a significant player in enterprise search, if only for the mindset of procurement teams you describe in your question.

As you look forward, what are some new features / issues that you think will become more important in 2009?

The issue of cloudsourcing will gain traction. As a consequence, not only small and medium sized enterprises will discover that they might not invest in in-house Content Management and Collaboration applications, but use a hosted service instead. This is when you need more than a “behind the firewall” search, because content will be scattered across multiple clouds (CRM cloud, Office cloud). I’m not sure whether we will see a breakthrough there in 36 months, but the sooner the better.

Where can I find more information about your services and research?

http://www.linkedin.com/in/mbaumgartel

Stephen E. Arnold, www.arnoldit.com/sitemap.html and Harry Collier, www.infonortics.com

Written by Stephen E. Arnold · Filed Under Conferences, Enterprise, Interview, News, Online (general), Search, Text analytics, Text processing | Comments Off on Exclusive Interview, Martin Baumgartel, From Library Automation to Search

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
