Topology Is Finally on Top

December 21, 2015

Topology’s time has finally come, according to “The Unreasonable Usefulness of Imagining You Live in a Rubbery World,” shared by 3 Quarks Daily. The engaging article reminds us that the field of topology emphasizes connections over geometric factors like distance and direction. Think of a subway map as compared to a street map; or, as writer Jonathan Kujawa describes:

“Topologists ask a question which at first sounds ridiculous: ‘What can you say about the shape of an object if you have no concern for lengths, angles, areas, or volumes?’ They imagine a world where everything is made of silly putty. You can bend, stretch, and distort objects as much as you like. What is forbidden is cutting and gluing. Otherwise pretty much anything goes.”

Since the beginning, this perspective has been dismissed by many as purely academic. However, today’s era of networks and big data has boosted the field’s usefulness. The article observes:

“A remarkable new application of topology has emerged in the last few years. Gunnar Carlsson is a mathematician at Stanford who uses topology to extract meaningful information from large data sets. He and others invented a new field of mathematics called Topological data analysis. They use the tools of topology to wrangle huge data sets. In addition to the networks mentioned above, Big Data has given us Brobdinagian sized data sets in which, for example, we would like to be able to identify clusters. We might be able to visually identify clusters if the data points depend on only one or two variables so that they can be drawn in two or three dimensions.”

Kujawa goes on to note that one century-old tool of topology, homology, is being used to analyze real-world data, like the ways diabetes patients have responded to a specific medication. See the well-illustrated article for further discussion.
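To make the cluster idea concrete, here is a minimal sketch, not Carlsson's method and far simpler than the homology tools the article describes. It counts connected components, the most basic topological feature of a point cloud, by linking any two points that lie within a chosen distance of each other. The function name `count_components`, the sample points, and the `epsilon` values are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def count_components(points, epsilon):
    """Count connected components when points at most epsilon apart are linked."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= epsilon:
            parent[find(i)] = find(j)  # merge the two groups

    return len({find(i) for i in range(len(points))})


# Hypothetical data: two visually obvious clumps in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
for epsilon in (0.05, 0.5, 10.0):
    print(epsilon, count_components(points, epsilon))
```

The count changes with the scale: every point stands alone at a tiny `epsilon`, the two clumps appear at a moderate one, and everything merges at a large one. Tracking which features persist across scales is the intuition behind the field, though the real tools go well beyond this toy.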

Cynthia Murrell, December 21, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Written by Stephen E. Arnold · Filed Under Analytics, Big data, Data mining, News, Search, Tools | Comments Off on Topology Is Finally on Top

Score One for Yandex

December 21, 2015

Russian search powerhouse Yandex has won its antitrust complaint against Google, we learn from re/code’s article, “Meet the Russian Company that Got Its Antitrust Watchdog to Bite Google.” Reporter Mark Bergen interviewed Yandex’s Roman Krupenin, who has led this legal campaign. In his intro, Bergen relates:

“In October, Russia’s antitrust authority ruled that Google’s practice of bundling its services on Android handsets violated national law. The case’s lead complainant was Yandex, an 18-year old Web search and advertising company. It’s not a global name, but is big in Russia. Last quarter, Yandex raked in $233.1 million in revenue. (For context, Google averaged about $179 million in sales a day over the same period.) Most Russians use Yandex for Internet searches — an estimated 57 percent in the last quarter, though that share has slipped in recent years. The culprit? According to Yandex, it’s the favored position of Google’s apps, including its search one and its browser, on Android smartphones, which outnumber iPhones in Russia considerably. To fight it off, Yandex has pushed to cut handset agreements of its own: It finalized one with Lenovo last year, and paired with Microsoft last month to make Yandex’s homepage and search results the Russian default for Windows 10.”

Furthermore, we’re reminded, Yandex is also taking part in the EU’s latest antitrust investigation. Naturally, Google is appealing the Russian ruling. See the article for the text of the interview, in which Krupenin discusses the focus on Android over Search, the unique factors that made for victory over the notoriously slippery company, and the call for an end to Google’s service-bundling practices.

Cynthia Murrell, December 21, 2015


Written by Stephen E. Arnold · Filed Under Google, Microsoft, Mobile, News, Search, Social Media, Web Services | Comments Off on Score One for Yandex

Watson Is Laying Startup Eggs

December 21, 2015

Incubators are warming stations for eggs. Without having to rely on an organism’s DNA donor, an incubator provides a warm, safe environment for the organism to develop, hatch, and eventually be ready to face the world. Watson has decided it is time for itself to propagate, but instead of knitting tiny computer cases Watson will invest its digital DNA in startups. The Chicago Tribune discusses Watson’s reproduction efforts and progeny in “Watson, IBM’s Big-Data Program Is Also A Startup Incubator.”

While IBM sells Watson’s ability to scan and understand terabytes of data, the company also welcomes developers to use Watson for new ideas. What is even more amazing is that IBM gives developers the ability to use Watson for free for a limited time.

“In Ecosystem, everyone is invited to play with Watson for free (for a limited time); some 77,000 developers have accepted. If your Watson-powered startup shows promise, it becomes a “partner,” often via a quasi-incubator model, and enjoys access to IBM business and technology advisers–and a shot at a capital infusion from the $100 million IBM is making available to Watson startups…”

Ecosystem has been used by startups that offer lifestyle coaching, personal shopping, infrastructure guards, veterinary advice, a fantasy sports calculator, 311 information, and even a hotel butler.

To quote the biblical justification for propagation: “Go forth and multiply the [Watson startups].”

Whitney Grace, December 21, 2015

Written by Stephen E. Arnold · Filed Under Big data, Business strategy, Data, IBM Watson, Marketing, News, Search | Comments Off on Watson Is Laying Startup Eggs

Getting Smart About Cutting the Cable Cord

December 21, 2015

A few years ago, I read an article about someone who was so fed up with streaming content that he resubscribed to cable; he wanted new shows and access to all the channels. I have to admit the easiest thing to do would be to pay a monthly cable bill and shell out additional fees for the premium channels. The only problem is that cable and extra channels are quite expensive. It has since become easier to cut the cord.

One of the biggest problems viewers face is finding specific and new content. Netflix, Hulu, iTunes, and Amazon Prime are each limited by their licenses to their own catalogs, and searching each service separately is time consuming. Even worse is trying to type out a series name using a remote control instead of a keyboard. Technology to the rescue!

In “Yahoo’s New App Is A TV Guide For Cord Cutters,” The Verge describes Yahoo Video Guide, an app that lets viewers search for a title by name and start watching it right away.

“Whenever users find what they want to watch, they can click a button to “Stream Now,” and the app will automatically launch a subscription service that hosts the film. If the program isn’t available online, users can buy it, instead.”

The coolest feature is that viewers who want to channel surf can do so with GIFs. The viewer picks a GIF that fits their mood, and the app sorts out content from there.

Finally, all those moving images have a different function than entertaining reddit users.

Whitney Grace, December 21, 2015

Written by Stephen E. Arnold · Filed Under News, Search, Technology, User experience, Yahoo | Comments Off on Getting Smart About Cutting the Cable Cord

Podcast Search Service

December 18, 2015

I read “Podcasting’s Search Problem Could be Solved by This Spanish Startup.” According to the write up:

Smab’s web app will automatically transcribe podcasts, giving listeners a way to scan and search their content.

What’s the method? I learned from the article:

The company takes audio files and generates text files. If those text files are hosted on Smab’s site, a person can click on a word in the transcript and it will take them directly to that part of the recording, because the transcript and the text are synced. In fact, a second program assesses the audio to determine where sentences begin, making it easier to find chunks of audio. Both functions are uneven, but it’s worth noting here that the company is in a very early stage.
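The click-to-seek behavior described in the quote can be sketched as a lookup from transcript words to start times. This is a guess at the general idea under stated assumptions, not Smab's implementation: the sample transcript, the timings, and the helper names `build_index` and `seek_time` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical output of a speech-to-text pass: (word, start time in seconds).
transcript = [
    ("welcome", 0.0),
    ("to", 0.4),
    ("the", 0.5),
    ("search", 0.7),
    ("podcast", 1.1),
    ("today", 2.0),
    ("we", 2.4),
    ("discuss", 2.6),
    ("search", 3.1),
]


def build_index(aligned_words):
    """Map each lowercased word to every time offset at which it is spoken."""
    index = defaultdict(list)
    for word, start in aligned_words:
        index[word.lower()].append(start)
    return index


def seek_time(aligned_words, position):
    """Return the playback offset for the word clicked at `position`."""
    return aligned_words[position][1]


index = build_index(transcript)
print(index["search"])           # every offset where "search" is spoken
print(seek_time(transcript, 4))  # offset for a click on the fifth word
```

The same index serves both uses the article mentions: a text search returns offsets to jump to, and a click on a transcript word maps straight to its position in the recording.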

There are three challenges for automatic voice-to-text indexing of audio and video sources:

First, there is a great deal of content. The computational cost to convert a large chunk of audio data to a searchable form and then offer a reasonably robust search engine is significant.

Second, selectivity requires an editorial policy. Business and government are likely paying customers, but the topics these folks chase change frequently. The risk is that a paying customer will be disappointed and drop the service. Thus, sustainable revenue may be an issue.

Third, indexing podcasts and YouTube is work that Apple handles rather offhandedly and YouTube performs as part of its massive investment in the Google search system. The fact that neither of these firms has pushed forward with more sophisticated search systems suggests that market demand may not be significant.

I hope the Smab service becomes available. Worth watching.

Stephen E Arnold, December 21, 2015

Written by Stephen E. Arnold · Filed Under News, Rich media, Search | Comments Off on Podcast Search Service

Topsy: Good Bye, Gentle Search Engine

December 18, 2015

I used Topsy as a way to search certain social content. No more. The service, she be dead.

The money-constrained Apple has shut down the public Topsy search system. See “Social Analytics Firm Topsy Shut Down by Apple Two Years After Purchase.”

If you want a recommendation for an alternative, sorry, I don’t have one. There are some solutions that are not free to the general public. The gateways to social media content require money and a bit of effort. If you cannot search content, maybe the content does not exist? That’s a comforting thought unless one knows that the content is available, just not searchable by a person with an Internet connection in a public library, at home, or from the local Apple store.

Stephen E Arnold, December 21, 2015

Written by Stephen E. Arnold · Filed Under News, Search, Social Media | Comments Off on Topsy: Good Bye, Gentle Search Engine

New Patent for a Google PageRank Methodology

December 18, 2015

Google was recently granted a patent for a different approach to page ranking, we learn from “Recalculating PageRank” at SEO by the Sea. Though the patent was just granted, the application was submitted back in 2006. Writer Bill Slawski informs us:

“Under this new patent, Google adds a diversified set of trusted pages to act as seed sites. When calculating rankings for pages. Google would calculate a distance from the seed pages to the pages being ranked. A use of a trusted set of seed sites may sound a little like the TrustRank approach developed by Stanford and Yahoo a few years ago as described in Combating Web Spam with TrustRank (pdf). I don’t know what role, if any, the Yahoo paper had on the development of the approach in this patent application, but there seems to be some similarities. The new patent is: Producing a ranking for pages using distances in a Web-link graph.”

The theory behind trusted pages is that “good pages seldom point to bad ones.” The patent’s inventor, Nissan Hajaj, has been a Google senior engineer since 2004. See the write-up for the text of the patent, or navigate straight to the U.S. Patent and Trademark Office’s entry on the subject.
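The seed-distance idea can be illustrated with a breadth-first search over a link graph. This is only a sketch under stated assumptions: it treats distance as a plain hop count, which may differ from how the patent defines distance, and the page names, the `links` graph, and the function `distances_from_seeds` are invented for the example.

```python
from collections import deque


def distances_from_seeds(links, seeds):
    """Return the fewest link hops from any seed page to each reachable page."""
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in hops:  # first visit is the shortest path in BFS
                hops[target] = hops[page] + 1
                queue.append(target)
    return hops


# Hypothetical Web-link graph: page -> pages it links to.
links = {
    "seed-a": ["news", "blog"],
    "seed-b": ["blog"],
    "news": ["shop"],
    "blog": ["shop", "spam"],
    "shop": [],
    "spam": ["spam-2"],
    "spam-2": [],
}

hops = distances_from_seeds(links, ["seed-a", "seed-b"])
# Pages closer to a trusted seed sort first; ties break alphabetically.
ranked = sorted(hops, key=lambda page: (hops[page], page))
print(ranked)
```

Pages that no seed can reach never appear in `hops`, which fits the intuition quoted above: if good pages seldom point to bad ones, pages far from every seed, or unreachable from them, deserve less trust.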

Cynthia Murrell, December 18, 2015


Written by Stephen E. Arnold · Filed Under Data, Google, News, Search, SEO, Web Services, Yahoo | 1 Comment

Old School Mainframes Still Key to Big Data

December 17, 2015

According to ZDNet, “The Ultimate Answer to the Handling of Big Data: The Mainframe.” Believe it or not, a recent Syncsort survey of 187 IT pros found the mainframe to be important to their big data strategies. IBM has even created a Hadoop-capable mainframe. Reporter Ken Hess lists some of the survey’s findings:

*More than two-thirds of respondents (69 percent) ranked the use of the mainframe for performing large-scale transaction processing as very important

*More than two-thirds (67.4 percent) of respondents also pointed to integration with other standalone computing platforms such as Linux, UNIX, or Windows as a key strength of mainframe

*While the majority (79 percent) analyze real-time transactional data from the mainframe with a tool that resides directly on the mainframe, respondents are also turning to platforms such as Splunk (11.8 percent), Hadoop (8.6 percent), and Spark (1.6 percent) to supplement their real-time data analysis […]

*82.9 percent and 83.4 percent of respondents cited security and availability as key strengths of the mainframe, respectively

*In a weighted calculation, respondents ranked security and compliance as their top areas to improve over the next 12 months, followed by CPU usage and related costs and meeting Service Level Agreements (SLAs)

*A separate weighted calculation showed that respondents felt their CIOs would rank all of the same areas in their top three to improve

Hess goes on to note that most of us probably utilize mainframes without thinking about it; whenever we pull cash out of an ATM, for example. The mainframe’s security and scalability remain unequaled, he writes, by any other platform or platform cluster yet devised. He links to a couple of resources besides the Syncsort survey that support this position: a white paper from IBM’s Big Data & Analytics Hub and a report from research firm Forrester.

Cynthia Murrell, December 17, 2015


Written by Stephen E. Arnold · Filed Under Analytics, Data, IBM Watson, News, Search, Technology | Comments Off on Old School Mainframes Still Key to Big Data

The Modern Law Firm and Data

December 16, 2015

We thought it was a problem when law enforcement officials did not understand how the Internet and the Dark Web work or what eDiscovery tools can do. A law firm that does not know how to work with data-mining tools, much less appreciate the importance of technology, is losing credibility, profit, and evidence for its cases. According to Information Week’s “Data, Lawyers, And IT: How They’re Connected,” the modern law firm needs to be aware of how eDiscovery tools, predictive coding, and data science work and how they can benefit its cases.

It can be daunting to understand how new technology works, especially in a law firm. The article explains how these tools and others work in four key segments: what role data plays before trial, how it is changing the courtroom, how new tools pave the way for unprecedented approaches to law practice, and how data is improving the way law firms operate.

Data in pretrial amounts to one word: evidence. People live their lives via their computers and create a digital trail without realizing it. With a few eDiscovery tools, lawyers can assemble all necessary information within hours. Data tools in the courtroom make practicing law seem like a scenario out of a fantasy or science fiction novel. Lawyers are able to immediately pull up information to use as evidence for cross-examination or to validate facts. New eDiscovery tools are also useful because they allow lawyers to prepare their arguments based on the judge and jury pool. More data is now available on individual cases, not just on big-name ones.

“The legal industry has historically been a technology laggard, but it is evolving rapidly to meet the requirements of a data-intensive world.

‘Years ago, document review was done by hand. Metadata didn’t exist. You didn’t know when a document was created, who authored it, or who changed it. eDiscovery and computers have made dealing with massive amounts of data easier,’ said Robb Helt, director of trial technology at Suann Ingle Associates.”

Legal eDiscovery is one of the main branches of big data, and it has skyrocketed in the past decade. While the examples discussed here are employed by respected law firms, keep in mind that eDiscovery technology is still new. Ambulance chasers and other law firms probably do not have a full IT squad on staff, so when vetting lawyers, ask about their eDiscovery capabilities.

Whitney Grace, December 16, 2015

Written by Stephen E. Arnold · Filed Under Dark Web, Data mining, Metadata, News, Search | Comments Off on The Modern Law Firm and Data

Google Timeline Knows Where You Have Been

December 16, 2015

We understand that to get the most out of the Internet, we sacrifice a bit of privacy; but do we all understand how far-reaching that sacrifice can be? The Intercept reveals “How Law Enforcement Can Use Google Timeline to Track Your Every Move.” For those who were not aware, Google helpfully stores all the places you (or your devices) have traveled, down to longitude and latitude, in Timeline. Now, with an expansion launched in July 2015, that information goes back years, instead of just six months. Android users must actively turn this feature off to avoid being tracked.

The article cites a report titled “Google Timelines: Location Investigations Involving Android Devices.” Written by a law-enforcement trainer, the report is a tool for investigators. To be fair, the document does give a brief nod to privacy concerns; at the same time, it calls it “unfortunate” that Google allows users to easily delete entries in their Timelines. Reporter Jana Winter writes:

“The 15-page document includes what information its author, an expert in mobile phone investigations, found being stored in his own Timeline: historic location data — extremely specific data — dating back to 2009, the first year he owned a phone with an Android operating system. Those six years of data, he writes, show the kind of information that law enforcement investigators can now obtain from Google….

“The ability of law enforcement to obtain data stored with privacy companies is similar — whether it’s in Dropbox or iCloud. What’s different about Google Timeline, however, is that it potentially allows law enforcement to access a treasure trove of data about someone’s individual movement over the course of years.”

For its part, Google admits they “respond to valid legal requests,” but insists the bar is high; a simple subpoena has never been enough, the company says. That is some comfort, I suppose.

Cynthia Murrell, December 16, 2015


Written by Stephen E. Arnold · Filed Under Analytics, Data, Google, Mobile, News, Search | Comments Off on Google Timeline Knows Where You Have Been


Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.

