Scientific Abbreviations Look Up
December 22, 2010
For some terms, both the long form (LF), such as Federal Bureau of Investigation, and the short form (SF), such as FBI, are immediately recognized.
However, the exact meaning of some abbreviations is not always so cut and dried. According to the Science Base article “Searching For Scientific Abbreviations” there are no clear-cut abbreviation rules for the science world. Researchers have sought to lay out specific rules for scientific abbreviations.
One new technique “known as LFXtractor, uses noun chunking together with a distance metric to detect SF-LF pairs regardless of the presence of parenthetical expressions.” However, with no exact rules, the jury is still out on how to use and understand scientific abbreviations.
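To make the SF-LF pairing idea concrete, here is a minimal Python sketch. It is not LFXtractor: it swaps noun chunking for a crude initial-letter match and uses a plain token count as the distance. The function names, the stopword list, and the max_distance parameter are all assumptions made for illustration.

```python
import re

# Small words that usually do not contribute a letter to an abbreviation (assumed list).
STOPWORDS = {"of", "the", "and", "for", "in", "on", "to", "a", "an"}


def initials(words):
    """Join the first letters of the non-stopword tokens, uppercased."""
    return "".join(w[0].upper() for w in words if w.lower() not in STOPWORDS)


def find_pairs(text, max_distance=10):
    """Return (short_form, long_form) pairs found in text.

    A short form is any all-caps token of 2 to 6 letters. A long form is a
    run of nearby tokens whose initials spell the short form. When several
    runs qualify, the one closest to the short form wins.
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    pairs = []
    for i, tok in enumerate(tokens):
        if not (tok.isupper() and 2 <= len(tok) <= 6):
            continue
        lo = max(0, i - max_distance)
        hi = min(len(tokens), i + 1 + max_distance)
        best = None  # (distance in tokens, long form)
        for start in range(lo, hi):
            if start == i:
                continue
            for end in range(start + 1, hi + 1):
                if start < i < end:
                    break  # the run would swallow the short form itself
                cand = tokens[start:end]
                if cand[0].lower() in STOPWORDS or cand[-1].lower() in STOPWORDS:
                    continue
                if initials(cand) != tok:
                    continue
                # Number of tokens between the run and the short form.
                distance = i - end if end <= i else start - i - 1
                if best is None or distance < best[0]:
                    best = (distance, " ".join(cand))
        if best is not None:
            pairs.append((tok, best[1]))
    return pairs


print(find_pairs("the long form Federal Bureau of Investigation and the short form FBI"))
# [('FBI', 'Federal Bureau of Investigation')]
```

Note that no parentheses are required, which is the point of the distance-based approach. A sketch this simple will still misfire on abbreviations that do not follow initial letters.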
Google helps when it comes to searching for abbreviations, but surprisingly even the search giant can’t do it all. The specific area or niche must be known because the same abbreviation can mean two different things depending on the subject area. Researchers may have to go back to the drawing board.
April Holmes, December 22, 2010
Freebie
Search Analyst Evaluation
December 21, 2010
“Analytics, Schmanalytics! How to Evaluate an Analyst” reminds us that analysts, unlike lawyers or accountants, are not professionally licensed and gives advice on how to know you are getting your money’s worth. The conclusion: “If you need analytics help, make the effort to assure yourself that the analyst is technically competent, understands your business and has the communications skills that you need.” Some things to look for are formal education or experience in the specific area to be analyzed, familiarity with businesses similar to yours, and a communication style that will suit the situation in which the presentation will be made, whether it be to your boss or in court. Then again, if a search consultant promises to get you sales leads or sales, you might want to ignore this advice altogether.
Alice Wasielewski, December 21, 2010
Freebie
Enterprise Search: Baloney Six Ways, like Herring
December 21, 2010
When my team and I reviewed my write up about the shift of some vendors from search to business intelligence, quite a bit of discussion ensued.
The idea that a struggling vendor of search, most often an outfit with older technology, commonly “reinvents” itself as a purveyor of business intelligence systems evoked some strong reactions.
One side of the argument was that an established set of methods for indexing unstructured content could be extended. The words used to describe this digital alchemy were Web services, connectors, widgets, and federated content. Now these are or were useful terms. But what happens is that the synthetic nature of English makes it easy to use familiar sounding words in a way to perform an end run around the casual listener’s mental filters. It is just not polite to ask a vendor to define a phrase like business intelligence. The way people react is to nod in a knowing manner and say “for sure” or “I’ve got it.”
Have you taken steps to see through the baloney passed off as enterprise search, business intelligence, and knowledge management?
The other side of the argument was that companies are no longer willing to pay big money for key word retrieval. The information challenge requires a rethink of what information is available within and to an organization. Then a system developed to “unlock the nuggets” in that treasure trove is needed. This side of the argument points to the use of systems developed for certain government agencies. The idea is that a person wanting to know which supplier delivers the components with the fewest defects needs an entirely different type of system. I understand this side of the argument. I am not sure that I agree, but I have heard this case so often that the USB with the MP3 of the business intelligence sound file just runs.
As we approach 2011, I think a different way to look at the information access options is needed. To that end, I have created a tabular representation of information access. I call the table and its content “The Baloney Scorecard, 2011.”
Webinar Finder from Peelon
December 20, 2010
We’ve recently stumbled upon a promising new resource at Peelon.com. To put it in their own words, Peelon.com “is a webinar directory and can be used as a webinar search engine” AND it is absolutely free of charge, not to mention free of advertising. Peelon vows to do just two things: help find a webinar, or help promote a webinar. After investigating the site for only a few minutes, we had easily harnessed its straightforward, no frills functionality.
The querying capability is there, allowing the user to sort all available records by date or time, industry and type of webinar. It wouldn’t be surprising to see these initial option categories expand with increased traffic. But for now, if those options are not sufficient to pinpoint the e-lecture of choice, there is a search box to enter any relevant word or phrase. The results can be filtered by date, comments or even popularity.
Click on any webinar and one will find all the pertinent details spelled out: date, time, description etc. Curiosity led me to check out the “Add new webinar” link which prompted a page of empty webinar details waiting for user input. By the looks of it, the process to post a webinar can’t take longer than five minutes and even that includes one coffee break.
All in all, this site is free of clutter and hassle, and it is just plain free. You won’t hear any complaining here!
Sarah Rogers, December 20, 2010
Freebie
Big Data, CAP, and NoSQL
December 19, 2010
We came across an interesting series of Web write ups about big data. You may know about the CAP theorem. The idea is that it is “impossible for a distributed computer system to provide simultaneously” guarantees of “consistency (all nodes see the same data at the same time), availability (node failures do not prevent survivors from continuing to operate), [and] partition tolerance (the system continues to operate despite arbitrary message loss).” For more, read the Wikipedia entry here.
Nati Shalom’s Blog series Part I, II, and III on the CAP theorem postulates that if you are worried about CAP, then maybe you just need to re-define your needs. Shalom’s general thesis is as follows:
One of the core principals behind the CAP theorem is that you must choose two out of the three CAP properties. In many of the transactional systems giving away consistency is either impossible or yields a huge complexity in the design of those systems. In this series of posts, I’ve tried to suggest a different set of tradeoffs in which we could achieve scalability without compromising on consistency. I also argued that rather than choosing only two out of the three CAP properties we could choose various degrees of all three.
Some useful info to tuck away for future reference and consult before talking to a vendor who is pitching scale and the cloud when you need search or content processing.
Alice Wasielewski, December 19, 2010
Freebie
How Americans Spend Their Time
December 18, 2010
Slurp, slurp.
That is the sound of “real journalists” gobbling the latest Forrester confection. I read “Forrester: Americans Spend Equal time Online and Watching TV.” Great headline, but I am not sure I know what “time” means. Also, the pairing of online and watching TV is ambiguous.
I get the point. Web activity is now as popular as watching the boob tube. Great.
But what happens to the data if a person watches TV when online?
I think I know what the mid tier outfit is trying to accomplish: make sales for its consulting business. The “data” are the bait for the canny Forrester fishermen and fisherwomen.
Here’s the main idea. People are spending as much time watching TV as they are fiddling with their computers, which I think includes devices that are computers one hauls around or tucks in a pocket.
Several observations:
- What’s the sample size? What was the sampling method? Is the n=xxx such a big deal? Omit that from the stats homework in the lousy liberal college I attended as a dull normal and the prof awarded an automatic F. Guess that doesn’t apply to mid tier consulting outfits.
- Online usage is growing. Okay, great to know since devices have been proliferating for several years. It makes sense that if there are more devices, usage would go up.
- TV sucks. Well, the write up did not document that, but the TV crowd, like the newspaper and other publishers, are in a tizzy as people use their laptops and gizmos like the Apple TV to get the programming each user wants. With control, TV sucks less. If you want only shows you love, TV does not suck at all.
- The features used by those online mirror the same Alexis-Charles-Henri Clérel de Tocqueville “average” that his travels in America documented. The only difference is that the stuff that pleases is pretty well known; for example, email, buying stuff, and socializing.
What’s not in the write up but may be in the “real” study available from Forrester? Facebook. My hunch is that the demographics of a statistically-valid sample rigorously surveyed would reveal some nuances not in the article and maybe in the “real” study. Here’s a list:
- In each demographic, which activity is growing more rapidly, which is decreasing more rapidly?
- In the demographic with the heaviest TV usage, what’s the group doing? Using the TV as background, a way to feel loved, or as a primary activity?
- In the demographic with the heaviest online usage, what amount of time is spent on Facebook versus any other social system?
- Across the sample, what is the lean back versus lean forward behavior? How many in each sector use one mode as a primary and the other mode as a secondary?
- Across demographics, who does the most buying? Under what conditions?
Our work in this field suggests some surprising behavioral shifts. The multitasking characteristic is covered in a Forrester blog post. Presumably that activity is documented rigorously in the “real” report.
But what about that sample? What confidence should I have in the oh-so-precise data? Without data about the mechanics of the study, not much I fear.
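To show why the sample size matters, here is a minimal Python sketch of the textbook margin of error for a reported proportion. It assumes a simple random sample and a 95 percent confidence level (z = 1.96). The sample sizes in the loop are hypothetical, not Forrester’s, since the write up does not give an n.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion p estimated from a simple random sample of size n.

    p=0.5 is the worst case (widest margin); z=1.96 corresponds to 95 percent confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical sample sizes, chosen only to show how the margin shrinks as n grows.
for n in (100, 1000, 10000):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

With 1,000 respondents the margin is roughly three percentage points either way. Without n and the sampling method, a reader cannot tell whether “equal time” is a real tie or noise.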
Stephen E Arnold, December 18, 2010
Freebie unlike the full reports from mid tier consulting firms
US Search Share, November 2010
December 17, 2010
Short honk: Fancy dancing with search share is underway. I read “Bing Search Share Edges Up in November.” In theory, the link will work for a few days. Key point: The Google tallies a 66.2 percent share. The combined Microsoft-Yahoo search share is pretty much the rest of the traffic. No margin of error and no details of the method.
Stephen E Arnold, December 18, 2010
Freebie
Baidu to Invest in Search Results Filtering
December 17, 2010
Those expensive coffees at boutique coffee shops are filtered. People like filtered coffee.
The goose knows why Baidu is probably going to be successful in certain markets. “Baidu to Spend $15 Mln to Screen Search Engine Results” reports that “China’s leading search engine plans to deploy $15 million to expunge illicit material and false information from its search results.” The source? State media.
Filtering makes some things better. One example is revenue derived from for-fee services for the China market. Image source: Weekender.com
Observations:
- Filtering happens. What’s interesting is the price tag placed on the renewed effort. Details about the scale of the filtering or “expunging” appear in the Interfax.com write up.
- Happy government officials. Reading between the lines I could see modest smiles of happiness on the faces of some government officials.
- Unbeatable advantage in the fastest growing and largest market for online in the world. No further comment necessary because some Google shareholders may ask, “Tell us again why you are not making every effort to maximize shareholder value in the world’s largest market?”
In my opinion, the $15 million is irrelevant. The message is the investment. Message received in Harrod’s Creek. I am not sure about elsewhere.
Stephen E Arnold, December 17, 2010
Freebie
Price Cutting: An Online Mystery
December 17, 2010
One of the mysteries of online is the behavior of users. Individually the actions are idiosyncratic. Put the behaviors of many users together, and you get a completely different insight into what happens in online environments. The usage data don’t falsify online actions. The more data one has, the easier it is to identify what’s hot and what’s not and what’s working and what isn’t.
Traditional media is starting to get with the online program. The chatter about tracking user behavior is one signal of growing awareness of the value of online behavior.
Every once in a while, a story appears in the “real” publishing industry that highlights one of the mysteries of online. To get the information first hand, navigate to “Amazon Can’t Dent iTunes.” The online version of the story was live as I write this on December 17, 2010. If the link is a 404, you can chase down a hard copy of the December 17, 2010, newspaper. The main point of the story for me was that Apple’s iTunes has resisted Amazon’s price cutting.
Amazon is like the Energizer bunny, one of the great ad campaigns in my opinion.
Now in a normal business, a “sale” or “close out” will attract shoppers. In retail, lower prices are one of the standard items in the selling tool kit. A local store had a surplus of weird green sweaters. I saw a sign that said, “Sweaters. $10 each.” The shoppers took the bait like a hungry trout in late autumn.
The Wall Street Journal story told me that price cutting in digital music has not worked out too well for Amazon. The Apple iTunes and snazzy hardware ecosystem has kept its grip on music. I am not a music person, so the fascination with digital music is interesting, but of no consequence to me.
Leaks Becoming a River
December 16, 2010
“Openleaks Set to Rival WikiLeaks for Business” announces that one of WikiLeaks’ former employees is opening a new, rival company. In sum: “Openleaks will be a ‘service provider for third parties that want to be able to accept material from anonymous sources’ and will be based in Germany.” The third party aspect distinguishes it from WikiLeaks, since it will act as an intermediary and will not host the information for the public. As these types of sites increase, governments are finding that the ability to gather electronic information is a two-way street: they can gather information on citizens, but citizens also can find ways to gather it themselves. And with the lack of current laws for adequately prosecuting Julian Assange, these kinds of leaks are not likely to be dammed up any time soon.
Alice Wasielewski, December 16, 2010
Freebie