Open Source: High Value, Low Regard?

July 3, 2014

If you are interested in the utility of open source information, you will want to pay particular attention to the disappearing content triggered by the EU’s right to be forgotten. Information is hard to find if the index has been scrubbed. I thought about the “disappearing” of information when I read “Out of Band.” The write up states:

Crowdsourcing and the wealth of networks are terms that are in vogue. What the government generally, and the secret world particularly, refuse to acknowledge is that information is a team sport and nature bats last. The government is only as good as its ability to do outreach, and if it relies on lies, nature—reality—will always reveal the truth at some future date.

Interesting point. However, when the most used source of information is filtering information, open source access becomes more important. With a single point of access, the reality becomes what’s findable. Will information access expand? Mr. Steele points out:

For the secret world, only a million-dollar custom-made shim will do, and they won’t notice if the beltway bandit sells them a piece of a beer can claiming it is the custom shim. I cannot overstate the ignorance and inattentiveness of today’s contracting officers and contracting officer technical representatives in the secret world.

In my view, his perspective applies to both commercial indexes and to government information methods. Fascinating. I keep wondering if Google is now the information government.

Stephen E Arnold, July 3, 2014

Microsoft Winds Up Being Cheaper

June 2, 2014

Maybe to the dismay of users, Microsoft winds up being cheaper long term than open source software. When it comes to total cost, Microsoft actually beats seemingly cheaper options once all investments in the system are considered. The topic is covered in a popular forum, Slashdot. Visit this thread to read more, “Microsoft Cheaper To Use Than Open Source Software, UK CIO Says.”

The discussion begins:

“Jos Creese, CIO of the Hampshire County Council, told Britain’s ‘Computing’ publication that part of the reason is that most staff are already familiar with Microsoft products and that Microsoft has been flexible and more helpful. ‘Microsoft has been flexible and helpful in the way we apply their products to improve the operation of our frontline services, and this helps to de-risk ongoing cost,’ he told the publication. ‘The point is that the true cost is in the total cost of ownership and exploitation, not just the license cost.’”

So while open source is enticing, it is possible that many organizations enter into open source implementations without considering the cost of customization, security, etc. and all the staffing time that goes with that. And while there may be good reasons to still go your own way with open source, it is best to do the research ahead of time and possibly consult with professionals who can look at the total cost of installation.

Sponsored by ArnoldIT.com, developer of Augmentext

Emily Rae Aldridge, June 02, 2014

For Your Inner SharePoint

May 23, 2014

Short honk: Qink.net offers a useful list of freely available SharePoint libraries. You can find the listing at http://bit.ly/1lGc7tM. There is no major subcategory for “information retrieval.” There is a pointer to Apache’s Lucene.net page. After scanning the list, my thought was that search is not a mainstream focus for these freely available components.

Stephen E Arnold, May 23, 2014

The Hadoop Elephant Offers A Helping Trunk

May 13, 2014

It is time for people to understand that relational databases were not made to handle big data. There is just too much data jogging around in servers and mainframes and the terabytes run circles around relational database frameworks. It is sort of like a smart fox toying with a dim hunter. It is time that more robust and reliable software was used, like Hadoop. GCN says that there are “5 Ways Agencies Can Use Hadoop.”

Hadoop is an open source programming framework that spreads data across server clusters. It is faster and less expensive than proprietary software. The federal government is always searching for ways to slash costs, and if it turns to Hadoop it might save a bit in tech costs.

“It is estimated that half the world’s data will be processed by Hadoop within five years. Hadoop-based solutions are already successfully being used to serve citizens with critical information faster than ever before in areas such as scientific research, law enforcement, defense and intelligence, fraud detection and computer security. This is a step in the right direction, but the framework can be better leveraged.”

The five ways the government can use Hadoop are to store and analyze unstructured and semi-structured data, to improve initial discovery and exploration, to make all data available for analysis, to serve as a staging area for data warehouses and analytic data stores, and to lower the cost of data storage.

So can someone explain why this has not been done yet?

Whitney Grace, May 13, 2014

MapR Integrates Elasticsearch into Platform

May 7, 2014

Writer Christopher Tozzi opens his VAR Guy article, “MapR, Elasticsearch Partner on Open Source Big Data Search,” with a good question: With so many Hadoop distributions out there, what makes one stand out? MapR hopes an integration with Elasticsearch will help it do that. The move brings to MapR, as the companies put it, “a scalable, distributed architecture to quickly perform search and discovery across tremendous amounts of information.” They report that several high-profile clients are already using the integrated platform.

Tozzi concludes with an interesting observation:

“From the channel perspective, the most important part of this story is about the open source Hadoop Big Data world becoming an even more diverse ecosystem where solutions depend on collaboration between a variety of independent parties. Companies such as MapR have been repackaging the core Hadoop code and distributing it in value-added, enterprise-ready form for some time, but Elasticsearch integration into MapR is a sign that Hadoop distributions also need to incorporate other open source Big Data technologies, which they do not build themselves, to maximize usability for the enterprise.”

It will be interesting to see how that need plays out throughout the field. MapR is headquartered in San Jose, California, and was launched in 2009. Formed in 2012, Elasticsearch is based in Amsterdam. Both Hadoop-happy companies maintain offices around the world, and each proudly counts some hefty organizations among its customers.

Cynthia Murrell, May 07, 2014

Watch for Flying Pigs as Microsoft Embraces Open Source

April 29, 2014

Microsoft is getting its open source on. Ars Technica reports, “Microsoft Open Sources a Big Chunk of .NET.” It seems the tech giant is softening its stance on open source resources; perhaps they now see they have little choice if the company wants to remain relevant. Writer Peter Bright reports:

“At its Build developer conference today [April 3, 2014], Microsoft announced that it was open sourcing a wide array of its .NET libraries and related technologies and creating a group, the .NET Foundation, to oversee the development and stewardship of the open source components.

“Perhaps the highlight of the announcement today was that the company will be releasing its Roslyn compiler stack as open source under the Apache 2.0 license. Roslyn includes a C# and Visual Basic.NET compiler, offering what Microsoft calls a ‘compiler as a service.’”

Included in the .NET Foundation are reps from Microsoft (of course), GitHub, and Xamarin. Xamarin and Microsoft have been collaborating for some time, and the former is contributing some of its own libraries to the Foundation. If Xamarin’s experience is any example, Microsoft really is making it easier to collaborate with them. Bright writes:

“We talked to Xamarin CTO Miguel de Icaza about working with Microsoft and the decision to make these components open source. For a long time, he said that while the engineers at the two companies had a good relationship, the decisions that Microsoft made—such as not allowing certain pieces of code to be used on non-Windows platforms—made things difficult for Xamarin.

“However, that changed late last year…. Last November, the companies announced that they were partnering in order to make it easier to use Xamarin’s tools to write code that works on both Microsoft and non-Microsoft platforms.”

Ah, cooperation! The article specifies that Microsoft has removed troublesome license restrictions, solicited design feedback from Xamarin, published docs under a Creative Commons license, and furnished Xamarin with its internal .NET test suite. Is this a sign of things to come? Stay tuned to see whether Microsoft continues to play well with others.

Cynthia Murrell, April 29, 2014

Open Source Search Gets Confusing

April 3, 2014

Elasticsearch is the favored open source search application and many startups have built their own products on top of the platform, increasing competition among the startups. InfoWorld lets us know that the competition is about to get stiffer in the article, “Logstash Steps Up As Splunk’s Latest Challenger.”

Splunk offers many big data solutions, including security, analytics, application management, and cloud services. The article explains that Logstash is part of a component stack that also includes Kibana and Elasticsearch. It is used to log data and can be configured to a user’s needs. It is an Apache-licensed open source endeavor and costs less (it is either free or sold under one of several paid support plans). Elasticsearch has commercialized Logstash through its Marvel product.

It does not appear that Logstash is a direct competitor, but the article explains:

“So far, the biggest distinction between Splunk and its competition is how they’re productized. Splunk’s a proprietary item, but with the emphasis on it being a product and not simply a technology stack. The competition still largely consists of open source stacks rather than actual services, but it’s clear the gap between what Splunk offers at a cost and what others offer for free is closing.”

Another new service pressures Lucid Imagination and other search vendors to create a response, which also makes investors impatient as Elasticsearch surges forward with bigger and better ideas. Search vendors are lost in the middle as they try to be competitive and earn a profit at the same time. Kudos to Elasticsearch and open source applications.

Whitney Grace, April 03, 2014

OpenCalais Has Big Profile Users

April 2, 2014

OpenCalais is an open source project that creates rich semantic data by using natural language processing and other analytical methods through a Web service interface. That is a simple explanation for a powerful piece of software. OpenCalais was originally part of ClearForest, but Thomson Reuters acquired the project in 2007. Instead of marketing OpenCalais as proprietary software, Reuters allowed it to remain open. OpenCalais has since become valued open source metadata software that is used on everything from blogs to specialized museum collections.

There are many notables who use OpenCalais, and a sample can be found in “The List Of OpenCalais Implementations Grows.”

OpenCalais is excited about the new additions to the list:

“Add 10 to the list of innovative sites and services that use OpenCalais to reduce costs, deliver compelling content experiences and mine the social web for insight. See our press release for more details on each. We are thrilled to recognize the following new sites and services that are changing the way we engage with news and the social Web. They join a growing number of others in media, publishing, blogging, and news aggregation who use OpenCalais.”

Among them are The New Republic, Al Jazeera’s English blogging news networks, Slate Magazine’s blogging network, and I*heart* Sea. Not only do news Web sites use OpenCalais, but news aggregation apps do as well, including Feedly, DocumentCloud, and OpenPublish. Expect the list to grow even longer and consider OpenCalais for your own metadata solution.

Whitney Grace, April 02, 2014

Elasticsearch: 70:30 Odds as the Next Big Thing in Search

March 28, 2014

We learned on March 26, 2014, that the German search vendor Intrafind has been looking for the next big thing. The company may have found it, and we expect that this low profile vendor will be plugging into the Elasticsearch power cable. Wikipedia already has, joining hundreds of other firms looking for a solution to doggy indexing in some other open source centric solutions.

Elasticsearch repackager SearchBlox has rolled out Version 8 of its hosted Elasticsearch system, according to Timo Selvaraj, Co-Founder/VP Product Management of SearchBlox.

As if these two recent developments were not enough, GovWizely, a Washington, DC engineering services firm, has added Elasticsearch to its arsenal. GovWizely, operated by Erik S. Arnold (yep, that’s my boy), has moved adroitly to capitalize on the surging interest in Elasticsearch’s high performance system.

Contrast Elasticsearch’s rise as the go to open source enterprise search system with the struggles of other open source search vendors and some commercial outfits. LucidWorks has ingested $2 million in venture funding, according to Crunchbase. Elasticsearch has received $34 million in funding. Parity, right?

Not so “fast”. (A gentle nod to the fascinating proprietary system shoe horned by Microsoft into SharePoint.) Elasticsearch seems to be catching up to LucidWorks or winning the critical struggle for developers. Here’s the Elasticsearch pitch:

[Image: the Elasticsearch pitch]

Understated and quiet, according to my engineering team. If the developments at Intrafind, SearchBlox, and Adhere Solutions, among others, are an early warning system, Elasticsearch certainly could be the “next big thing” in search, enterprise and otherwise.

What’s this mean for the proprietary and non open sourcey vendors like Coveo, Funnelback, Lexmark ISYS, and Hewlett Packard? I would suggest that these firms’ management have to adapt to what appears to be an emergent and disruptive force in information processing. If Elasticsearch does emulate the growth of the pre HP Autonomy, it is likely that the millions in venture funding pumped into search companies and search acquisitions will never be repaid. Chilling thought for some stakeholders who may have jumped on the wrong horse and seem compelled to continue to feed the nag fresh, expensive, non recoverable “clover.” (Think millions in hard cash funding with little to show that a payback is imminent or even possible.)


GitHub Search: Handy for Some Amazon Sportiness

March 24, 2014

GitHub, an open sourcey operation, is in the news again. Navigate to “AWS Urges Developers to Scrub GitHub of Secret Keys.” ITNews reports that some math club members—sorry, open source folks—have “inadvertently exposed their log-in credentials.”

The write up points out that a search of GitHub “for AWS keys returns almost 10,000 results.” The article notes:

GitHub is a community site where developers post their code and allow collaboration from other interested devs. The problem is developers aren’t taking enough care to ensure their credentials are properly protected.
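For readers wondering how such a search turns up keys at all, here is a minimal sketch of the idea, not the method ITNews or AWS describes. It assumes that long-term AWS access key IDs look like "AKIA" followed by 16 uppercase letters or digits; the function name and the sample text are hypothetical, and the key shown is the placeholder value from AWS documentation, not a live credential.

```python
import re

# Assumption: long-term AWS access key IDs look like "AKIA" plus 16
# uppercase letters or digits. This pattern is a heuristic, not a spec.
ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def find_access_key_ids(text):
    """Return (line_number, key_id) pairs for every match in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


# Hypothetical config file contents; the key is the placeholder value
# used in AWS documentation, not a live credential.
sample = "region = us-east-1\naws_access_key_id = AKIAIOSFODNN7EXAMPLE\n"
print(find_access_key_ids(sample))
```

A check like this could run before code is pushed to a public repository. It only spots key IDs that fit the assumed pattern, so a clean result does not prove a repository is free of secrets.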

With the management issues at GitHub, perhaps open source evidences some of the fissures in the open source approach to life, business practices, and, of course, search?

Stephen E Arnold, March 24, 2014
