Gleaning Insights and Advantages from Semantic Tagging for Digital Content
September 22, 2016
The article titled Semantic Tagging Can Improve Digital Content Publishing on the Aptara Corp. blog reveals the importance of indexing. The article waves the flag of semantic tagging at the publishing industry, which has been pushed into digital content kicking and screaming. The difficulties involved in compatibility across networks, operating systems, and devices are quite a headache. Semantic tagging could help, if only anyone understood what it is. The article enlightens us:
Put simply, semantic markups are used in the behind-the-scene operations. However, their importance cannot be understated; proprietary software is required to create the metadata and assign the appropriate tags, which influence the level of quality experienced when delivering, finding and interacting with the content… There have been many articles that have agreed the concept of intelligent content is best summarized by Ann Rockley’s definition, which is “content that’s structurally rich and semantically categorized and therefore automatically discoverable, reusable, reconfigurable and adaptable.”
The application to the publishing industry is obvious when put in terms of increasing searchability. Any student who has used JSTOR knows the frustrations of searching digital content. It is a complicated process that indexing, if administered correctly, will make much easier. The article points out that authors are competing not only with each other, but also with the endless stream of content being created on social media platforms like Facebook and Twitter. Publishers need to take advantage of semantic markups and every other resource at their disposal to level the playing field.
Chelsea Kerwin, September 22, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
An Early Computer-Assisted Concordance
November 17, 2015
An interesting post at Mashable, “1955: The Univac Bible,” takes us back in time to examine an innovative indexing project. Writer Chris Wild tells us about the preacher who realized that these newfangled “computers” might be able to help with a classically tedious and time-consuming task: compiling a book’s concordance, or alphabetical list of key words, their locations in the text, and the context in which each is used. Specifically, Rev. John Ellison and his team wanted to create the concordance for the recently completed Revised Standard Version of the Bible (also newfangled). Wild tells us how it was done:
“Five women spent five months transcribing the Bible’s approximately 800,000 words into binary code on magnetic tape. A second set of tapes was produced separately to weed out typing mistakes. It took Univac five hours to compare the two sets and ensure the accuracy of the transcription. The computer then spat out a list of all words, then a narrower list of key words. The biggest challenge was how to teach Univac to gather the right amount of context with each word. Bosgang spent 13 weeks composing the 1,800 instructions necessary to make it work. Once that was done, the concordance was alphabetized, and converted from binary code to readable type, producing a final 2,000-page book. All told, the computer shaved an estimated 23 years off the whole process.”
The article is worth checking out, both for more details on the project and for the historic photos. How much time would that job take now? It is good to remind ourselves that tagging and indexing data has only recently become a task that can be taken for granted.
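As a rough answer to that question, the core of a concordance (key words, their locations, and a window of context, sorted alphabetically) can be sketched in a few lines of Python. This is only an illustration, not a reconstruction of the Univac program: the stop-word list, the three-word context window, and the function name build_concordance are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; the 1955 project picked its key words differently.
STOP_WORDS = {"the", "and", "of", "in", "a", "to", "was", "there"}


def build_concordance(text, width=3, stop_words=STOP_WORDS):
    """Map each key word to a list of (position, context) pairs.

    position is the 0-based word index in the text; context is the key
    word plus up to `width` words on either side of it.
    """
    words = re.findall(r"[A-Za-z']+", text)
    entries = defaultdict(list)
    for i, word in enumerate(words):
        key = word.lower()
        if key in stop_words:
            continue  # skip words that are not worth indexing
        start = max(0, i - width)
        context = " ".join(words[start:i + width + 1])
        entries[key].append((i, context))
    # Alphabetize the key words, as a printed concordance would be.
    return dict(sorted(entries.items()))


sample = (
    "In the beginning God created the heaven and the earth. "
    "And the earth was without form."
)
concordance = build_concordance(sample)
for key, hits in concordance.items():
    for position, context in hits:
        print(f"{key:<10} {position:>3}  {context}")
```

Choosing how much context to keep is the one real design decision here, which matches the article's note that context gathering was the hardest part of the original job.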
Cynthia Murrell, November 17, 2015
RAVN Pipeline Coupled with ElasticSearch to Improve Indexing Capabilities
October 28, 2015
The article on PR Newswire titled RAVN Systems Releases its Enterprise Search Indexing Platform, RAVN Pipeline, to Ingest Enterprise Content Into ElasticSearch unpacks the decision to improve the ElasticSearch platform by supplying the indexing platform of the RAVN Pipeline. RAVN Systems is a UK company, founded by consultants and developers, with expertise in processing unstructured data. Their stated goal is to discover new lands in the world of information technology. The article states:
“RAVN Pipeline delivers a platform approach to all your Extraction, Transformation and Load (ETL) needs. A wide variety of source repositories including, but not limited to, File systems, e-mail systems, DMS platforms, CRM systems and hosted platforms can be connected while maintaining document level security when indexing the content into Elasticsearch. Also, compressed archives and other complex data types are supported out of the box, with the ability to retain nested hierarchical structures.”
The added indexing ability is very important, especially for users trying to index from or into cloud-based repositories. Even a single instance of any type of data can be indexed with the Pipeline, which also enriches data during indexing with auto-tagging and classifications. The article also promises that non-specialists (by which I assume they mean people) will be able to use the new system because it is GUI-driven and intuitive.
Chelsea Kerwin, October 28, 2015
Compound Search Processing Repositioned at ConceptSearching
July 2, 2015
The article titled Metadata Matters; What’s The One Piece of Technology Microsoft Doesn’t Provide On-Premises Or in the Cloud? on ConceptSearching re-introduces Compound Search Processing, ConceptSearching’s main offering. Compound Search Processing is a technology developed in 2003 that can identify multi-word concepts and the relationships between words. Compound Search Processing is being repositioned, with Concept Searching apparently chasing SharePoint sales. The article states:
“The missing piece of technology that Microsoft and every other vendor doesn’t provide is compound term processing, auto-classification, and taxonomy that can be natively integrated with the Term Store. Take advantage of our technologies and gain business advantages and a quantifiable ROI…
Microsoft is offering free content migration for customers moving to Office 365…If your content is mismanaged, unorganized, has no value now, contains security information, or is an undeclared record, it all gets moved to your brand new shiny Office 365.”
The angle for Concept Searching is metadata and indexing, and they are quick to remind potential customers that “search is driven by metadata.” The ConceptSearching offering comes with the promise that it is the only platform that will work with all versions of SharePoint while delivering their enterprise metadata repository. For more information on the technology, see the new white paper on Compound Term Processing.
Chelsea Kerwin, July 2, 2015
The Dichotomy of SharePoint Migration
May 7, 2015
SharePoint Online gets good reviews, but only from critics and those who are utilizing SharePoint for the first time. Those who are sitting on huge on-premises installations are dreading the move and biding their time. The split stems from SharePoint trying to be all things to all people. Search Content Management covers the issue in its article, “Migrating to SharePoint Online is a Tale of Two Realities.”
The article begins:
“Microsoft is paving the way for a future that is all about cloud computing and mobility, but it may have to drag some SharePoint users there kicking and screaming. SharePoint enables document sharing, editing, version control and other collaboration features by creating a central location in which to share and save files. But SharePoint users aren’t ready — or enthused about — migrating to . . . SharePoint Online. According to a Radicati Group survey, only 23% of respondents have deployed SharePoint Online, compared with 77% that have on-premises SharePoint 2013.”
If you need to keep up with how SharePoint Online may affect your organization’s installation, or the best ways to adapt, keep an eye on ArnoldIT.com. Stephen E. Arnold is a longtime leader in search and distills the latest tips, tricks, and news on his dedicated SharePoint feed. SharePoint Online is definitely the future of SharePoint, but it cannot afford to get there at the cost of its past users.
Emily Rae Aldridge, May 7, 2015

