How to Create Your Own Oracle Text Index
March 22, 2012
The Swiss-Army Development blog recently released some useful information about keyword search with Oracle Text in the post “Keyword Search via Oracle Text.”
The post attempts to create a foundation for using Oracle Text to implement full text search in a table. It takes readers step-by-step through the process of building the back end of an Oracle Text Index and then leveraging that index to include full text search.
The writer states the reasoning behind this project:
“Oracle text is a feature available in the Oracle Database and is used to provide keyword search indexing to large blocks of text and even binary formatted files like Word and PDF files. As part of a project I am working on, I need to create a keyword search index that spans multiple columns. This will allow my users to search for keywords in the title, abstract and content of a note entered into the system. The note could be in the form of an uploaded file, or it could be manually entered through the interface.”
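The idea of one keyword index spanning the title, abstract, and content of a note can be sketched in a few lines of Python. This is a conceptual illustration only, not Oracle Text and not the approach in the cited post; the field names come from the quote, while the function names and sample notes are hypothetical.

```python
import re
from collections import defaultdict

# Field names taken from the quoted description of a note (assumed layout).
FIELDS = ("title", "abstract", "content")

def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(notes):
    """Map each keyword to the ids of notes containing it in any field."""
    index = defaultdict(set)
    for note_id, note in notes.items():
        for field in FIELDS:
            for token in tokenize(note.get(field, "")):
                index[token].add(note_id)
    return index

def search(index, query):
    """Return ids of notes that contain every keyword in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical sample data for illustration.
notes = {
    1: {"title": "Quarterly budget", "abstract": "Spending review",
        "content": "Travel costs rose."},
    2: {"title": "Travel policy", "abstract": "Rules for trips",
        "content": "Book flights early."},
}
index = build_index(notes)
```

A query such as `search(index, "budget travel")` matches a note even when the two keywords sit in different fields, which is the point of indexing across columns.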
The Swiss wash their cows, a useful activity if not germane to milk, cheese, and beef.
Similar to Oracle Text perhaps?
Stephen E. Arnold, March 22, 2012
Navigating SharePoint Trials
March 22, 2012
SharePoint has almost become a corporate mandate, but how can you tell if and how SharePoint will work for your organization? Many companies want to run a trial to see if SharePoint is a good fit. CMSWire covers some SharePoint trial options in, “Office365 or SharePoint Foundation – What’s the Best SharePoint Trial?”
Chris Wright covers some of the major pros and cons:
But the world is slowly changing. The Cloud is now everywhere, and many enterprise applications and services are happily migrating. SharePoint is one of them, included as part of Microsoft’s wider Office365 offering. SharePoint Online, available as a free trial for 30 days, is now another way to investigate the world of SharePoint. However it is important to realize that Online differs from Foundation, and indeed all the locally installed versions, in a number of significant ways. So for the new users wanting to evaluate SharePoint, which is best — SharePoint Foundation or SharePoint Online?
But what happens if a trial is run and SharePoint is not a great fit? The internet is full of content generated by SharePoint developers and users who give practical advice for customizing SharePoint to more effectively meet certain needs. If you determine that your organization cannot invest added time and money in customization, there is another option: third-party solutions.
Fabasoft Mindbreeze is a strong enterprise search suite that is already tailored to the needs of end users and developers, cutting out the customization step and allowing more intuitive interfacing with all users.
Fabasoft Mindbreeze is more than a search:
Fabasoft Mindbreeze Enterprise understands you, or to be more precise, understands what the most important information is for you at any precise moment in time. It is the center of excellence for your knowledge and simultaneously your personal assistant for all questions. The information pairing technology brings enterprise and Cloud data together.
Read more about their offerings and see if Fabasoft Mindbreeze might be an asset to your organization.
Emily Rae Aldridge, March 22, 2012
Sponsored by Pandia.com
Another Poobah Insight: Marketing Is an Opportunity
March 21, 2012
Please, read the entire write up “Marketing Is the Next Big Money Sector in Technology.” When you read it, you will want to forget the following factoids:
- Google has been generating significant revenue from online ad services for about a decade
- Facebook is working to monetize with a range of marketing services every single one of the 800 million plus Facebook users
- Start ups in and around marketing are flourishing as the scrub brush search engine optimizers of yore bite the dust. A good example is the list of exhibitors at this conference.
The hook for the story is a quote from an azure chip consultancy. The idea is that as traditional marketing methods flame out, crash, and burn, digital marketing is the future. So the direct mail of the past will become the spam email of the future, I predict. Imagine.
Marketing will chew up an organization’s information technology budget. The way this works is that since “everyone” will have a mobile device, the digital pitches will know who, what, where, why, and how a prospect thinks, feels, and expects. The revolution is on its way, and there’s no one happier than a Madison Avenue executive who contemplates the riches from the intersection of technology, hapless prospects, and good old fashioned hucksterism. The future looks like a digital PT Barnum, I predict.
Optimizing SharePoint 2010 with Powershell
March 21, 2012
For SharePoint developers who need to customize their infrastructure, Powershell can be a powerful tool. Craig Pilkenton offers his suggestions for getting the most out of it in “Using Powershell to… Upload Documents in SharePoint 2010.”
Powershell is the tool for getting things done in all versions of SharePoint (and your servers/desktops too!). It has the capability to automate, monitor, notify, and even ‘react’ to results. Not only does it have custom compiled ‘activities’ to get work done, it has the ability to call any command-line executable or pull in .NET library’s to handle anything not already given to us by Microsoft (or even our own cmdlets). In this article we’ll go over how to use some of the new “manage content” SharePoint cmdlets to interact with the platform just as a user would.
Pilkenton is a Senior Microsoft Consultant at CDW, so his advice is credible and valuable. He even lists a couple of items for consideration before using Powershell, and breaks down the internal structure line by line.
But for those who are not Senior Microsoft Consultants, and even those who may be intimidated by a command line driven program, there are other options. Many third party solutions exist to make customization more intuitive, with less technical skill required.
Fabasoft Mindbreeze offers an entire suite of enterprise solutions to work alongside, or in place of, SharePoint. Read more about their quick, service-oriented, and cost-effective offerings:
The award-winning high-tech product is your personal assistant. 24/7, 365 days a year. Regardless of which data you are looking for and with which system you are working with – Fabasoft Mindbreeze Enterprise answers your questions with pinpoint accuracy.
SharePoint is a powerful and ubiquitous tool in the world of search, but for those who cannot devote the time and attention to such a large infrastructure, a smaller, more intuitive solution may be just the fit.
Emily Rae Aldridge, March 21, 2012
How to Improve SharePoint Information Architecture
March 20, 2012
Over at the SharePoint Pro Blog, Robert Bogue recently posted, “5 Steps to Making SharePoint Information Architecture Work for You.” Information architecture can be an extensive process, creating the structure and tools for your information such that it can be stored, retrieved, and managed in the most efficient way. In the article, Bogue covers some productive steps to take toward creating better information architecture. First on Bogue’s list is to identify attributes:
Identifying the attributes is typically a process of identifying all of the content in your organization that people want to store and find. This might include invoices, purchase orders, time sheets, et cetera. Each of these has a series of attributes such as the invoice date, customer ID, or vendor ID. These may be valuable for organizing the information for retrieval.
Identifying attribute values and creating ranges and groups are also discussed. Of course, you cannot leave out designing a powerful search and navigation experience when discussing an organization’s custom architecture.
This light read is a good introduction to Information Architecture and provides some basic ways to beef up your existing system. But to strengthen your SharePoint system while spending less time on training and configuring add-ons, consider Fabasoft Mindbreeze. Part of the full suite of solutions is the Fabasoft Folio Connector, which provides uniform, reliable management of your digital content. Here is a highlight:
Fabasoft Mindbreeze Enterprise is able to search all data sources connected to the platform simultaneously. In addition to data from, for example, Microsoft Exchange or the file system, the Fabasoft Folio Connector allows to query information objects and documents from Fabasoft Folio, too.
Learn more about connecting your enterprise-wide information assets at Mindbreeze, where they seem to have the benefits of a proper installation down pat.
Philip West, March 20, 2012
Lexmark: Under Its Own Nose
March 20, 2012
I read “Lexmark Acquires Isys Search Software and Nolij” (Knowledge, get it?). In 2008, Hewlett Packard acquired Lexington-based Exstream Software. HP paid $350 million for the company, leaving Lexmark wondering what its arch printing enemy was doing. Now more than three years later, Lexmark is lurching through acquisitions.
On March 7, 2012, I reported that Lexmark purchased Brainware, a search, eDiscovery, and back office system. Brainware caught my attention because its finding method was based in part on trigram technology. I recall seeing patents on the method which were filed in 1999. I have a special report on Brainware if anyone is interested. Brainware has a rich history. Its technology stretches back to SER Solutions (See US6772164). SER was once part of SER Systems AG. The current owners bought the search and technology and generated revenue from its back office capabilities, not the “pure” search technology. However, Brainware’s associative memory technology struck me as interesting because it partially addressed the limitations of trigram indexes. Brainware became part of Lexmark’s Perceptive Software unit.
Now, a mere two weeks later, Lexmark snags another search and retrieval company. Isys Search was started by Iain Davies in 1988. Mr. Davies was an author and an independent consultant in IBM mainframe fourth generation languages. His vision was to provide an easy-to-use search system. When I visited with him in 2009, I learned that Isys had more than 12,000 licensees worldwide. However, in the US, Isys never got the revenue traction which Autonomy achieved. Even Endeca, which was roughly one-tenth the size of Autonomy, was larger than Isys. The company began licensing its connectors to third parties a couple of years ago, and I did not get too many requests for analyses of the company’s technology. Like Endeca, the system processes content and generates a list of entities and other “facets” which can help a user locate additional information for certain types of queries.
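For readers unfamiliar with trigram indexing, here is a minimal sketch of the general technique in Python. It is a generic textbook-style illustration, not Brainware's patented method or its associative memory technology; the function names and sample documents are hypothetical.

```python
from collections import defaultdict

def trigrams(text):
    """Return the set of overlapping three-character sequences in text."""
    s = text.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}

def build_trigram_index(docs):
    """Map each trigram to the ids of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in trigrams(text):
            index[gram].add(doc_id)
    return index

def candidates(index, query):
    """Return ids of documents containing every trigram of the query.

    These are candidates only: a document can hold all of the trigrams
    without holding the query string itself, so a real system would
    verify each candidate against the original text.
    """
    grams = trigrams(query)
    if not grams:
        # Queries shorter than three characters produce no trigrams.
        return set()
    return set.intersection(*(index.get(g, set()) for g in grams))

# Hypothetical sample data for illustration.
docs = {1: "invoice archive", 2: "voice recorder"}
index = build_trigram_index(docs)
```

The sketch also shows two commonly cited weaknesses of plain trigram indexes: queries under three characters cannot be looked up at all, and matches need a verification pass.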
Now Lexmark, which allowed Exstream to go to HP, has purchased two companies with technology which is respectively 24 and 12 years old. I am okay with this approach to obtaining search and retrieval functionality, but I do wonder what Lexmark is going to do to leverage these technologies now that HP has Autonomy and Oracle has Endeca. Microsoft is moving forward with Fast Search and a boat load of third party search solutions from certified Microsoft partners. IBM does the Lucene Watson thing, and every math major from New York to San Francisco is jumping into the big data search and analytics sector.
Here’s a screen shot of the Isys Version 8 interface, which has been updated I have heard. You can see its principal features. I have an analysis of this system as well.
What will Lexmark do with two search vendors?
Here’s the news release lingo:
“Our recent acquisitions enable Lexmark to offer customers a differentiated, integrated system of solutions that are unique, cost effective, and deliver a rapid return on investment,” said Paul Rooke, Lexmark’s chairman and CEO. “The methodical shift in our focus and investments has strengthened our managed print services offerings and added new content and process technologies, positioning Lexmark as a key solutions provider to businesses large and small.”
Perceptive Software is now in the search and content processing business. However, unlike Exstream, these two companies do not have a repository and cross media publishing capability. I think it is unlikely that Lexmark/Perceptive will be able to shoehorn either of these two systems’ technology into its printers. Printers make money because of ink sales, not because of the next generation technology that some companies think will make smart printers more useful. Neither Brainware nor Isys has technology which meshes with the big data and Hadoop craziness now swirling around.
True, Lexmark can invest in both companies, but the cash required to update code from 1988 and methods from 1999 might stretch the Lexmark pocket book. Lexmark has been a dog paddler since the financial crisis of 2008.
Source: Google Finance
Here’s the Lane Report’s take on the deal:
Lexmark’s recent acquisitions have advanced its “capture/manage/access” strategy, enabling the company to intelligently capture content from hardcopy and electronic documents through a range of devices including the company’s award-winning smart multifunction products and mobile devices, while also managing and processing content through its enterprise content management and business process management technologies. These technologies, when combined with Lexmark’s managed print services capabilities, give the company the unique ability to help customers save time and money by managing their printing and imaging infrastructure while providing complementary and high value, end-to-end content and process management solutions.
I have a different view:
First, a more fleet footed Lexmark would have snagged the Exstream company. It was close to home, generating revenue, and packaged a solution. Exstream was not a box of Lego blocks. What Perceptive now has is an assembly job, not a product which can go head to head against Hewlett Packard. Maybe Lexmark will find a new market in Oracle installations, but Lexmark is a printer company, not a data management company.
Second, technology is moving quickly. Neither Brainware nor Isys has the components which allow the company to process content and output the type of results one gets from Digital Reasoning or Palantir. Innovative Ikanow is leagues ahead of both Brainware and Isys.
Third, neither Brainware nor Isys is open source centric. Based on my research and our forthcoming information services about open source technology, neither company is in that game. Because growth is exploding in the open source sector, how will Lexmark recover its modest expenditures for these two companies?
I think there may be more lift in the analytics sector than the search sector, but I live in Harrod’s Creek, not the intellectual capital of Kentucky where Lexmark is located.
Worth watching.
Stephen E Arnold, March 20, 2012
Inteltrax: Top Stories, March 12 to March 16
March 19, 2012
Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, how some online sources are embracing the big data revolution.
One of the hottest names in online retail in the last 12 months is breaking analytic ground, according to “Groupon Expands into Big Data,” but is that a good idea?
Another online giant is losing a little analytic ground to its social media competitors, as we learned in “Facebook No Longer Biggest Analytic Source.”
The internet presence of nonprofits has certainly increased and analytics is helping them help more people, according to “Nonprofit Analytics Could Spell Big Business.”
Just like any industry, the online world has taken a shine to analytics. The results, much like anywhere else, tend to be positive but also involve some shrinking. No matter who’s involved or what direction they’re headed, it’s a fascinating ride and we’ll be monitoring it every day.
Follow the Inteltrax news stream by visiting www.inteltrax.com
Patrick Roland, Editor, Inteltrax.
March 19, 2012
Combine the Cloud and On-Premise Capabilities in SharePoint
March 19, 2012
Over at the SharePoint Pro Blog, Chris McNulty recently posted, “The SharePoint Decision: Do We Choose Cloud or On-Premises?” In the article, the author looks at SharePoint Online versus on-premises SharePoint. SharePoint Online lacks a few features found in an on-premises farm, but SharePoint Online opens new doors that are appealing. McNulty offers a list of questions to ask yourself with respective scores for each answer that may help you decide between the two SharePoint options.
McNulty has this to say:
It’s also important to remember that a cloud vision is almost always a future-looking strategy. Since the cloud is uniformly available, it’s easier to deliver content to users with less respect for their immediate location or device (PCs, tablets, smartphones). Similarly, although Office 365 and SharePoint Online lack features relative to on-premises SharePoint, this isn’t expected to be a permanent situation. If we project forward through the next release of SharePoint, we can forecast a time when the on-premises and cloud versions of SharePoint provide nearly identical functions.
The guidance questions cover a variety of topics, such as the extent of your SharePoint IT team, geographic limitations, and budget outlooks. These questions may help you evaluate your system, but we like a simpler solution. Consider a third party solution that combines the best of both worlds. A solution worth a close look is Fabasoft Mindbreeze. One of the key components in their full suite of solutions is the information pairing technology:
Our information pairing technology makes you unbeatable. Information pairing brings enterprise information and information in the Cloud together. This gives you an overall image of a company’s knowledge. This is the basis for your competitive advantage. In this way you can act quickly, reliably, dynamically and profitably in all business matters. Fabasoft Mindbreeze Enterprise and the Cloud fit perfectly together. The Cloud makes you and your business mobile – Mindbreeze makes itself at home in the Cloud.
Read more about the integrated solution that requires no installation at Fabasoft Mindbreeze.
Philip West, March 19, 2012
Text Analytics Gurus Discuss the State of the Industry
March 19, 2012
Text Analytics News recently reported on an interview with Seth Grimes, the president of Alta Plana Corporation, and Tom H. C. Anderson, managing partner of Odin Text - Anderson Analytics, in the article “Infinite Possibilities of Text Analytics.”
According to the article, in preparation for the 8th Annual Text Analytics Summit East in Boston, Text Analytics News reached out to these influential thinkers in the text mining field and asked them some questions regarding the state of the industry.
In response to a question regarding the changes in the approach of analysis software for unstructured data, Grimes said:
The big changes in text analytics are the embrace of and by Big Data, the development of ever-more sophisticated algorithms, and a shift in the way user invoke the technologies. Enterprises understand that a high proportion of Big Data is unstructured: Variety is one of Big Data’s three “Vs.” Text analytics providers know they have to meet challenges presented by the other two “Vs:” Volume and Velocity.
Stephen E Arnold, publisher of Beyond Search, will discuss the implications of “near term, throw forward” algorithms. Mr. Arnold will describe how injections of content can distort the outputs of certain analytic methods. At the fall 2011 conference, Mr. Arnold’s presentation provided a reminder that “objective” outputs may not be.
This is an interesting interview that would be worth checking out for those who are interested in attending the conference or just finding out a little more information about how content is analyzed. For registration information visit the Text Analytics website.
Jasmine Ashton, March 19, 2012
Monty Program Releases Version of MariaDB
March 18, 2012
Attention, MySQL fans.
Developers at Monty Program believe they’ve finally got the formula for their MariaDB project on the right track. In the article “MariaDB 5.3.5 Delivers Faster Subqueries” we get a better idea of its functional capabilities.
MariaDB 5.3.5 is the first stable release of the touted MariaDB 5.3 relational database series. Developers focused on improved performance (of course) as well as improving querying capabilities and functionality. The developers now feel that the new query optimizer is ready for more widespread production use.
They have finally made subqueries in MariaDB usable. With semi-join optimizations, the join optimizer can run IN subqueries by selecting one of five execution strategies. A subquery map shows which queries and optimizations are used in the different versions of MariaDB.
One core optimization, the Table Pullout, can replace sub-queries with a join where appropriate. If the sub-query is not a semi-join, MariaDB 5.3 falls back to other methods including extracting the results of the subquery into a temporary table, or the older IN-TO-EXISTS optimization, the only one to be carried forward to MariaDB 5.3. There is also a subquery cache to reduce the number of times already optimized subqueries are re-executed.
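The rewrite of an IN subquery as a join can be illustrated with a small, self-contained Python sketch. It runs on SQLite through the standard `sqlite3` module rather than MariaDB, so it shows only that the two query forms return the same rows; it does not show what MariaDB's optimizer does internally. The table names and data are hypothetical. The rewrite is safe here because `customers.id` is a primary key, so the join cannot duplicate order rows.

```python
import sqlite3

# In-memory SQLite database with hypothetical tables, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'CH'), (2, 'US'), (3, 'CH');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0),
                          (12, 3, 15.0), (13, 1, 60.0);
""")

# Form 1: an IN subquery, the kind of query a semi-join strategy targets.
in_subquery = """
SELECT o.id FROM orders o
WHERE o.customer_id IN (SELECT c.id FROM customers c WHERE c.country = 'CH')
ORDER BY o.id
"""

# Form 2: the same question written as a plain join. Because customers.id
# is unique, each order matches at most one customer row.
join_form = """
SELECT o.id FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.country = 'CH'
ORDER BY o.id
"""

subquery_rows = [row[0] for row in conn.execute(in_subquery)]
join_rows = [row[0] for row in conn.execute(join_form)]
```

If the subquery column were not unique, the join form could return duplicate rows, which is one reason an optimizer can apply this kind of replacement only "where appropriate."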
It’s a definite step in the right direction as far as data management is concerned. By being able to map your queries and create subqueries for more relevant material, you’re able to maximize the potential of your software and production capabilities. Good show.
Stephen E Arnold, March 18, 2012