Maps on Steroids
May 17, 2011
Here is an interesting link from the people behind Mapsys.info: “Public Data Visualization with Google Maps and Fusion Tables”.
“Visualizing” public data basically means mapping information that is relevant to a community. A good working example mentioned in the posting is San Francisco’s Bay Area bike accident tracker. The map’s legend decodes the various colored dots, indicating the type of accident and how it came to be recorded.
Source: http://mapsys.info/
A screenshot of the code needed to display a map with personalized details is offered in the posting. The star of the show is the integration with a Fusion Table, a tool offered by Google to house data sets to be presented on a map. Additional functionality comes from the “SQL-like query syntax” and from leveraging “the Python libraries Google provides for query generation and API calls”. This allows you to pick smaller data sets out of the Fusion Table.
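For a sense of what that query layer looks like, here is a minimal sketch. The table ID, column names, and the exact behavior of the Fusion Tables SQL query endpoint are our assumptions for illustration; the posting’s own code is not reproduced here.

# Minimal sketch: pulling a subset of rows from a hypothetical Fusion Table
# with SQL-like syntax. Table ID, column names, and endpoint details are
# illustrative assumptions, not specifics from the posting.
import urllib.parse
import urllib.request

TABLE_ID = "123456"  # hypothetical Fusion Table holding accident records

# SQL-like query: pick a smaller data set out of the full table
sql = ("SELECT Location, AccidentType FROM {0} "
       "WHERE AccidentType = 'bicycle'".format(TABLE_ID))

url = ("https://www.google.com/fusiontables/api/query?"
       + urllib.parse.urlencode({"sql": sql}))
with urllib.request.urlopen(url) as response:
    print(response.read().decode("utf-8"))  # rows come back as CSV text

Each returned row can then be dropped onto a Google Map as a colored marker, which appears to be how the accident tracker works.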
So behind the scenes, this looks like another example of search moving beyond the token keyword. You won’t hear any complaints out of us. I remember creating maps using old-fashioned methods when I was working on my engineering degree. This method delivers accuracy and time savings.
Sarah Rogers, May 17, 2011
Freebie
Logica Taps Ontology for Large-Scale Data Management Tools
May 6, 2011
“Ontology, Logica Team on Enterprise Data Solutions,” announces Billing & OSS World. In response to client demand in Europe, the deal pairs Ontology’s semantic search product with Logica’s existing penetration in that market.
In the write up, we noted this passage:
Ontology Systems and Logica have formed a strategic relationship to provide Enterprise Data Alignment (EDA) solutions for communication service providers (CSPs) who want to search and align knowledge from the customer, service and network data distributed across their operational, business and infrastructure systems.
In 2006, Ontology saw a niche and set out to fill it. Their semantic search solutions are built to help communication service providers avoid “data misalignment.” In other words, they provide advanced tools that turn a wealth of disorganized data into actionable information.
Logica is a business consulting firm based in the U.K., serving clients around the world in everything from the automotive industry to utility providers. Among other things, they perform Enterprise Content Management, which explains their interest in cutting-edge data management tools.
What are the content processing and search tools available to licensees? The write up remains mum. Big data often means big findability problems.
Cynthia Murrell, May 6, 2011
Freebie
Fetch: Interesting View of Big Data
April 24, 2011
Our sister publication, Inteltrax, covers the world of data fusion, but we thought that Fetch’s stance on big data was appropriate for Beyond Search’s readers.
Fetch Technologies’ blog entry, “Bringing the Web to Big Data,” is worth a look. In it, Timo Kissel presents a useful point of view on the challenge of big data.
With all the talk about merely managing colossal amounts of data, ways to benefit from that data can feel like an afterthought. Fetch puts the focus back on how we humans can make the best use of Big Data:
But what’s more exciting to me is the use of this Big Data infrastructure to glean novel insights by using new approaches, algorithms, and analytics that simply weren’t feasible before. . . . This is another instance of using computers to do what they’re good at (tireless processing of large amounts of information) and using humans to do what we’re good at – pattern recognition, creativity and insight – albeit now at a scale that would be impossible for us to execute without these novel tools.
Kissel’s example involves retailers. Sure, they can continue to analyze sales from their own stores for trends. However, it would be so much better to open up the whole Web, with global information about their products as marketed in different areas by different competitors. Immediately.
Fetch, of course, has some ideas on how to do that with the firm’s own services. But whether you go to them or not, this viewpoint represents a profitable way to approach what is now almost every organization’s new hurdle.
Cynthia Murrell, April 24, 2011
Freebie
Bilingual Search at PaginasAmarillas.com
April 13, 2011
We learned via PRNewswire’s “YaSabe.com to Provide Bilingual Local Search for U.S. Hispanics at PaginasAmarillas.com” that PaginasAmarillas.com is now bilingual.
With over 50 million Hispanics now living in the U.S., it only makes sense to address that niche. YaSabe and Publicar, S.A. are teaming up to do just that with their bilingual products and services search deal.
‘Hispanics everywhere can relate to the Paginas Amarillas brand,’ said Carlos Caceres, Internet Business Director at Publicar. ‘Partnering with YaSabe, we will provide a world-class local search experience at PaginasAmarillas.com for the 50 million Hispanics that live in the United States.’
Publicar S.A. blankets South and Central America with access to multimedia content, directory assistance, internet search services, and other digital products.
Ya Sabe, Inc. connects U.S. Latinos with resources from local businesses to national brands. Their bilingual search, complete with access to a live human for recommendations, taps into the almost $1 trillion in combined disposable income wielded by Hispanics in this country.
This new service promises to be a welcome tool for Spanish speakers in this country. It is also a smart business move. Check it out at PaginasAmarillas.com.
Cynthia Murrell, April 13, 2011
Freebie
Disappearing US Government Public Information
April 9, 2011
The goose is not going to honk too much about the shuttering of US government Web sites. Most of them get few hits. I know you think that millions of mouse potatoes rush to such thrillers as the US Department of Agriculture’s numerous Web sites, or that thousands of fact-hungry MBAs explore the treasure trove of Department of Commerce content. The reality is that usage is not setting the world on fire.
An outfit called the Sunlight Foundation reported that a bunch of US government Web sites were going dark. The list appeared in “Budget Technopocalypse Deepens: Transparency Sites Will Go Dark In A Few Months.” How do you like that word “technopocalypse”? Felicitous, right?
Anyway, the alleged goners are:
- Apps.gov – Better hurry. Bring your credit card.
- Data.gov – Some interesting but often incomplete data sets
- IT Dashboard – Some spending information. Fascinating for the non-economists
- Paymentaccuracy.gov – Love the charts
- USASpending.gov – Keep in mind the $1.6 trillion deficit and you are good to go.
Low traffic is the norm for most government Web sites. One happy exception is the IRS Web site at tax time; traffic drops off after April 15th each year.
Coincident with the removal of sunshine data, the US government will notify me of changes in the terror alert level via Facebook and Twitter. Seems a fair trade, I suppose.
No further comments from the goose.
Stephen E Arnold, April 9, 2011
Freebie
A SQL Server Keeper: Data Extraction Tool
April 9, 2011
We think Microsoft SQL Server is just about perfect. Well, most of the time. When our favorite database has the hiccups, life can become pretty darned exciting.
“Server Database Extract Tool to Extract SQL Server Database Proficiently” introduces a potentially handy resource to aid in times of trouble. Should you find yourself with a damaged SQL server and thus an inaccessible database, SysTools SQL Recovery is worth a try.
What does the tool do? Our quick look revealed that it deals with such corruption issues as these (a diagnostic sketch follows the list):
- The file *.mdf is missing and needs to restore
- Server can’t find the requested database table
- Table corrupt: object ID wrong
- The DELETE statement conflicted with the REFERENCE constraint; the conflict occurred in database
- The conflict occurred in database msdb, table dbo.sysmaintplan_subplans, column ‘job_id’
- Error 3403, Severity 22, during recovery initialization
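Before buying, a sanity check with SQL Server’s own consistency checker can confirm which of these errors you are facing. Here is a minimal sketch, assuming Python with the pyodbc module; the connection string and database name are hypothetical, and this is our illustration, not part of the SysTools product.

# Minimal diagnostic sketch (ours, not SysTools SQL Recovery): run
# DBCC CHECKDB to surface corruption errors like those listed above.
# Connection string and database name are hypothetical.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={SQL Server};SERVER=localhost;"
    "DATABASE=master;Trusted_Connection=yes",
    autocommit=True,  # run the DBCC command outside an explicit transaction
)
cursor = conn.cursor()
try:
    # NO_INFOMSGS suppresses routine output; corruption surfaces as errors
    cursor.execute("DBCC CHECKDB ('YourDatabase') WITH NO_INFOMSGS")
    print("No corruption reported.")
except pyodbc.Error as err:
    print("Corruption detected:", err)  # time to reach for a recovery tool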
For compatibility information, check the SysTools SQL Recovery Web site at http://www.sqlserverdatabaserecovery.com/.
The vendor specializes in recovering critical business data from SQL Server 2000, 2005, and 2008, and claims successful recoveries across servers, operating systems, and databases, including relational database servers, Web servers (Apache and Microsoft IIS), business application servers, snap servers, NAS, SAN, and document and content management systems.
Pricing is $129 for a personal license and $229 for a business license. There is also a demo version available for download, so it’s worth a look.
Sarah Rogers, April 9, 2011
Freebie
Recorded Future in the Spotlight: An Interview with Christopher Ahlberg
April 5, 2011
It is big news when In-Q-Tel, the investment arm of the US intelligence community, funds a company. It is really big news when Google funds a company. But when both of these tech-savvy organizations fund a company, Beyond Search has to take notice.
After some floundering around, ArnoldIT was able to secure a one-on-one interview with the founder of Recorded Future. The company is one of the next-generation cloud-centric analytics firms, and what sets it apart technically is, of course, what pulled In-Q-Tel and Google to the Boston-based firm.
Mr. Ahlberg, one of the founders of Spotfire, which was acquired by the hyper-smart TIBCO organization, has turned his attention to Web content and predictions. Using sophisticated numerical recipes, Recorded Future can make observations about trends. This is not fortune telling, but mathematics talking.
In my interview with Mr. Ahlberg, he said:
We set out to organize unstructured information at very large scale by events and time. A query might return a link to a document that says something like “Hu Jintao will tomorrow land in Paris for talks with Sarkozy” or “Apple will next week hold a product launch event in San Francisco”. We wanted to take this information and make insights available through stunning user experiences and application programming interfaces. Our idea was that an API would allow others to tap into the richness and potential of Internet content in a new way.
When I probed for an example, he told me:
What we do is to tag information very, very carefully. For example, we add metatags that make explicit when we find an item of data, when it was published, when we analyzed it, and what actual time point (past, present, future) the datum refers to. The time precision is quite important. Time makes it possible for end users and modelers to deal with this important attribute. At this stage in our technology’s capabilities, we’re not trying to claim that we can beat someone like Reuters or Bloomberg at delivering a piece of news the fastest. But if you’re interested in monitoring, for example, the coincidence of an insider trade with a product recall, we can probably beat most at that.
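To make the idea concrete, here is a minimal sketch of how such time tags might be represented. The class and field names are our own invention, not Recorded Future’s actual schema.

# Illustrative sketch of the time metadata described above; the class and
# field names are ours, not Recorded Future's schema.
from dataclasses import dataclass
from datetime import date

@dataclass
class TaggedDatum:
    text: str
    found: date       # when the item was located
    published: date   # when the source published it
    analyzed: date    # when the pipeline processed it
    refers_to: date   # the time point the statement is actually about

item = TaggedDatum(
    text="Hu Jintao will tomorrow land in Paris for talks with Sarkozy",
    found=date(2011, 4, 4),
    published=date(2011, 4, 4),
    analyzed=date(2011, 4, 5),
    refers_to=date(2011, 4, 5),  # "tomorrow" resolved to an explicit date
)

# With explicit time points, a modeler can isolate forward-looking items:
if item.refers_to > item.published:
    print("Forward-looking statement:", item.text)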
To read the full text of the interview with Mr. Ahlberg, click here. The interview is part of the Search Wizards Speak collection of first-person narratives about search and content processing. Available without charge on the ArnoldIT.com Web site, the more than 50 interviews comprise the largest repository of firsthand explanations of “findability” available.
If you want your search or content processing company featured in this interview series, write seaky2000 at yahoo dot com.
Stephen E Arnold, April 5, 2011
Freebie
Big Data, Big Hassles
April 4, 2011
InfoWorld warns, “Big Data runs afoul of big lawyers.” The article emphasizes that the increasingly popular “Big Data” can be inexpensive, until the attorneys get involved.
Big Data has come to refer to large datasets and the tools used to analyze them, a combination that can yield important information if used correctly. It can also be inexpensive.
However, you might want to bring in your counsel before going too far. The article tells the story of Pete Warden, who:
“. . . described how he spent just $100 to scrape 500 million Web pages, including 220 million Facebook public profiles, using his own Web crawler and a 100-machine cluster running on Amazon EC2. He was able to analyze the information to match Twitter, LinkedIn, and Facebook accounts with the email accounts of users of his email tool.
“Then, just for fun, he created interactive maps showing how various countries, U.S. states, and cities connect with each other over social media and what types of fan pages they frequent.”
Neat, huh? Facebook didn’t think so. Their legal department cost him over 30 times the money he spent on the adventure.
So, venture forth, but be careful as you explore this new arena.
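One cheap precaution, for example, is to honor a site’s robots.txt before crawling it. Here is a minimal sketch using Python’s standard library; the URLs and user-agent string are illustrative, and this is a courtesy check, not legal advice.

# Check robots.txt before fetching; URLs and the user-agent string are
# illustrative. A courtesy check, not legal advice.
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()

target = "https://www.example.com/public-profiles/12345"
if rp.can_fetch("my-research-crawler", target):
    print("Allowed to fetch", target)
else:
    print("robots.txt disallows", target)  # fetching anyway invites lawyers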
Cynthia Murrell, April 4, 2011
Freebie
Resource Links: Text Extraction From HTML Documents
March 28, 2011
We found another nifty links page to add to your software utility file. The list comes from Tomaž Kovačič’s Tech Blog. He gathered resource links about text extraction from HTML documents to aid the wayward IT worker.
He first highlights articles that cover the basics of text extraction. Reading these gives you a general understanding of text extraction and the best way to approach it for your needs. He also mentions how to eliminate content “noise” (i.e., content farms).
He’s also collected a comprehensive list of links related to software about text extraction. He says, “There is only a small amount of competition when it comes to software capable of [removing boilerplate text / extracting article text / cleaning web pages / predicting informative content blocks] or whatever terms authors are using to describe the capabilities of their product.”
Extracting text from an HTML document is relatively simple; it is the choice of software that makes the job complex. He ends with information about APIs and other miscellaneous links that will be helpful. Stash it away for future use.
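For readers who want a feel for the basic task before diving into the links, here is a bare-bones sketch using only Python’s standard library. Real article extractors add the boilerplate-removal smarts the listed tools compete on; this only strips markup.

# Bare-bones HTML text extraction with the standard library. Real tools add
# the hard part: deciding which text blocks are boilerplate versus article.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # nesting depth inside script/style tags

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

extractor = TextExtractor()
extractor.feed("<html><body><script>var x;</script>"
               "<p>Article text.</p></body></html>")
print(" ".join(extractor.chunks))  # -> Article text.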
Whitney Grace, March 28, 2011

