Thursday, November 9: 8:45 a.m. - 9:45 a.m.
For a KM initiative to be successful, knowledge managers must secure the support of senior leaders before implementation. Early top management buy-in results in funding, resources, advocacy, usage, broad organizational support, and success: the program yields its expected benefits, and KM is spoken of and written about positively by leaders, stakeholders, and users. Hear from our long-time KM practitioner about proven practices, illustrated by real-world examples, for securing resources, active participation, and ongoing advocacy from top leadership. Get lots of tips for leading an effective, sustainable KM program that is seen as essential to the success of companies in different industries, of different sizes, and with different cultures.
Stan Garfield, Author of six KM books & Founder, SIKM Leaders Community
Thursday, November 9: 9:45 a.m. - 10:00 a.m.
For more than a decade, search technology has been used as the primary access point to the mountains of knowledge and data sitting behind an organization’s firewall. As environments evolve to account for private and public clouds, search is evolving beyond just the box to an API for human information. Will Hayes explores that evolution and talks about how search technologies and professionals play a key role in the enterprise cloud migration strategy.
Will Hayes, CEO, Lucidworks
Thursday, November 9: 12:00 p.m. - 1:00 p.m.
If you are a believer in the data-driven organization (or even just curious) and have ever wondered what could happen if you cleverly combined the power of data collection, indexing, text mining, search, and machine learning into a unified platform and applied it within the enterprise, this talk is for you! Come learn about the state of cognitive search and analytics technology and how it is enabling great companies across a wide swath of industries to amplify mission-critical expertise within their business in a surprisingly short amount of time. Our speaker illustrates the technology in action with real-world examples.
Scott Parker, Director of Product Marketing, Sinequa
Thursday, November 9: 10:15 a.m. - 11:00 a.m.
This talk describes work that the IBM Taxonomy Squad has done to develop an enterprise-scale service that automates the extraction of entities and the generation of meaningful metadata. We cover the approach taken to design a solution architecture that leverages a corporate knowledge base and integrates best-of-breed services in taxonomy and ontology management, NLP, machine learning, text annotation, and entity extraction.
Dan Segal, Information Architect, IBM
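The abstract names entity extraction and metadata generation but no specific tooling; the following minimal Python sketch, with spaCy as a stand-in (an assumption, not the IBM stack), illustrates the core idea of turning extracted entities into document metadata:

# Minimal sketch: entity extraction driving metadata generation.
# spaCy is assumed as a stand-in; the IBM solution architecture
# described in the talk is not reproduced here.
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_metadata(text):
    """Group extracted entity mentions by type to form document metadata."""
    doc = nlp(text)
    metadata = {}
    for ent in doc.ents:
        metadata.setdefault(ent.label_, set()).add(ent.text)
    return {label: sorted(names) for label, names in metadata.items()}

print(extract_metadata("IBM opened a research lab in Zurich in 1956."))
# e.g. {'DATE': ['1956'], 'GPE': ['Zurich'], 'ORG': ['IBM']}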
The promise of machine learning has become a practical reality in today’s enterprise, but companies often struggle with implementation or with getting reliable results. One fundamental issue is the common “garbage in, garbage out” problem: poor input stems from a lack of clean data, or from unclear results when unstructured data analysis feeds machine learning models. Well-built taxonomies powering clear text analytics rules are an important piece of infrastructure that is often overlooked in data science activities. Come learn more about the role of taxonomy and text analytics as sources of clean data for machine learning.
Ahren Lehnert, Principal Taxonomist, Nike Inc., USA
Thursday, November 9: 11:15 a.m. - 12:00 p.m.
Machine learning techniques can be used effectively for a wide variety of text analysis scenarios, such as reputation monitoring on social media, fraud detection, patent analysis, and e-Discovery. But to apply them well, you need to understand where the limits and pitfalls are in the technology, and you need to understand your data and the problem you are trying to solve. This session outlines an approach that uses text analytics to help understand the characteristics of your data, followed by selection and tuning of linguistic and statistical processing and machine learning parameters to address the application at hand. We highlight three real-world projects that used this approach and show how they worked, what went right and wrong, and how they evolved over time.
Jeff Fried, Director, Platform Strategy & Innovation, InterSystems
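As a concrete illustration of the tuning step Fried describes (selecting and tuning linguistic and statistical processing and machine learning parameters), here is a hedged scikit-learn sketch; the library and parameter choices are assumptions, not the projects’ actual stack:

# Joint search over linguistic choices (n-grams, stop words) and model
# parameters, after exploratory text analytics has characterized the data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", LinearSVC())])
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],  # unigrams vs. unigrams+bigrams
    "tfidf__stop_words": [None, "english"],  # linguistic preprocessing choice
    "clf__C": [0.1, 1.0, 10.0],              # statistical regularization
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1_macro")
# texts, labels = a labeled corpus (placeholders, not shown)
# search.fit(texts, labels)
# print(search.best_params_, search.best_score_)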
Government agencies face tremendous challenges daily. These include providing services to ensure a safe, livable environment; making informed spending decisions; and regulating a healthy economy. The data that supports these missions is exploding and is increasingly unstructured. This presentation discusses the application of text analytics and visualizations across a number of these datasets and their respective initiatives to provide an actionable view into the data, demystifying techniques such as predictive modeling and machine learning along the way. We show how these techniques can be applied to research analytics, government spending, situational awareness, and assessing consumer financial complaints.
Tom Sabo, Advisory Solutions Architect, SAS
Thursday, November 9: 1:00 p.m. - 1:45 p.m.
The terminologies that form taxonomies, thesauri, classification schemes, and name authorities aim to define all concepts unambiguously. These conceptual definitions are, however, primarily written for a human audience and are only partially meaningful to automated categorization processes. This talk explores how automated categorization rules can be synthetically generated by mining the terminology and semantic relationships found in traditional knowledge organization systems. We examine the pros, cons, and limitations of using categorization rules derived from KOS and discuss how they can then be refined and extended using human-curated categorization rules.
Dave Clarke, EVP, Semantic Graph Technology, Synaptica, part of Squirro AG, UK
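To make the idea of synthetically generated rules concrete, here is an illustrative Python sketch (not Synaptica’s product) that mines one thesaurus concept’s terminology and relationships into a flat OR-rule:

# A concept matches a document when its preferred label, any synonym,
# or any narrower term appears in the text. Real KOS-derived rules are
# far richer; this only illustrates the mining-to-rule step.
concept = {
    "prefLabel": "machine learning",
    "altLabels": ["statistical learning"],
    "narrower": ["deep learning", "supervised learning"],
}

def synthesize_rule(concept):
    """Flatten KOS terminology into a list of trigger phrases."""
    return [concept["prefLabel"], *concept["altLabels"], *concept["narrower"]]

def categorize(text, rule):
    lowered = text.lower()
    return any(term in lowered for term in rule)

print(categorize("A survey of deep learning methods", synthesize_rule(concept)))  # True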
This talk presents an original approach to processing search results. Rather than showing the usual 10 blue links to webpages, the software creates a text summary of those webpages—a narrative on the topic of the user’s query. The narrative gives the user a quick way to understand the key information on his query. This approach is best applicable to queries that are informational in nature, i.e., those where the user wants to understand a particular subject and get a quick grasp of a concept, an event, a product, or a public figure. The talk focuses on the merits and drawbacks of the approach and comparison with other techniques of presenting the answer to the user’s query.
Dmitri Soubbotin, Founder & CEO, Semantic Engines LLC
Thursday, November 9: 2:00 p.m. - 2:45 p.m.
There is much talk about building triple stores from source content, but most of the models remain just that: models, without content to back them up. This session covers a case study of building a triple store from nearly 6 million documents to support search and other use cases. It also looks at the extraction, or mining, process for pulling 22 types of triple sets from full text and redeploying them for search queries. Lessons learned are also covered.
Marjorie M.K. Hlava, President & Chairman, Access Innovations, Inc. and Data Harmony
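The build-then-query pattern behind this case study can be sketched with rdflib; the 22 triple types and the 6-million-document pipeline are not reproduced, and the triple type shown is hypothetical:

# Mine triples from full text into a store, then redeploy them for search.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")
g = Graph()

# Hypothetical triples extracted from documents (subject, predicate, object)
g.add((EX.doc1, EX.mentionsOrganization, Literal("Access Innovations")))
g.add((EX.doc2, EX.mentionsOrganization, Literal("Data Harmony")))

# A search-style query: which documents mention which organizations?
results = g.query("""
    SELECT ?doc ?org WHERE {
        ?doc <http://example.org/mentionsOrganization> ?org .
    }
""")
for doc, org in results:
    print(doc, org)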
This session addresses the main principles of extracting entities and relationships from unstructured content against ontologies and semantic data sets. We give industry examples of business cases and key components of semantic technology architectures, including text analytics and supporting data and metadata governance workflows. Finally, we demonstrate semantic annotation, discussing the challenges organizations face in this regard and some of the important lessons learned in more than 15 years of industry experience.
Borislav Popov, Text Analytics and Annotation, Ontotext AD
Thursday, November 9: 3:00 p.m. - 3:45 p.m.
Auto-categorization is “auto” only in part; much of the process still requires old-fashioned human judgment. One critical step on the human side of the fence is evaluating the quality of results so refinements can be developed and fed back into the process. But how do you measure quality when human indexers themselves apply topics inconsistently and often differ over the applicability of topics? This case study explains how one publisher approached quality assessment in light of human variability and details how classic recall and precision measures were adjusted to provide a user-focused sense of auto-categorization quality.
Larry Lempert, Director, Product Research and Planning, Bloomberg BNA
In search, there is often a trade-off between recall and precision, and this impacts any evaluation of approaches: If one system achieves higher recall but lower precision, is it better? Traditionally, this situation has been addressed by using a measure that combines precision and recall into a single number, such as the F1 score. F1 makes strong assumptions about the amount of precision you can trade for a little more recall, and those assumptions are not always appropriate. In some contexts, recall and precision have very different significance. This talk presents a novel performance measure called the extrapolated precision, which avoids making such strong assumptions about allowed trade-offs between precision and recall.
Bill Dimm, Founder & CEO, Hot Neuron LLC
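Both talks above lean on standard precision/recall arithmetic, so a small worked example may help; note that the extrapolated-precision formula itself is not given in the abstract, so only F1 is shown:

# F1 is the harmonic mean of precision and recall. It encodes a fixed
# assumption about how much precision may be traded for extra recall,
# which is exactly the assumption Dimm's talk questions.
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(f1(precision=0.90, recall=0.60))  # 0.72
print(f1(precision=0.70, recall=0.80))  # ~0.747: F1 prefers this system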
Thursday, November 9: 10:15 a.m. - 11:00 a.m.
For modern digital enterprises, the key to survival is real-time predictive analytics performed on heterogeneous data gathered from multiple sources and layered with contextual intelligence. That data is a mix of structured and unstructured content. Establishing contextual relevance requires systems imbued with deep reasoning capabilities that can link relevant pieces of information from within and outside the organization. This talk presents the outline of a framework that can gather news events in real time, classify them, reason over them, and finally link them to an enterprise information repository, thereby generating alerts or early warnings for subscribed users. The framework is presented through a number of case studies.
Lipika Dey, Principal Scientist, Innovation Labs, Tata Consultancy Services
The global rise of political populism and media criticism has made brands increasingly keen to avoid placement of their campaigns in negative and compromising contexts (bad ads). However, ad targeting is still predominantly based on behavioral techniques that rely heavily on (cookie-based) user profiling. This talk showcases a solution for real-time contextual targeting that exploits the full power of cognitive computing to match campaigns to online users’ real interests. The approach abandons tracking of user data of any kind while increasing the precision of ad targeting at a truly semantic level, beyond what keyword-based methods can achieve.
Heiko Beier, CEO, MORESOPHY
Thursday, November 9: 11:15 a.m. - 12:00 p.m.
The Inter-American Development Bank (IDB) is the main source of multilateral financing for Latin America and the Caribbean and, beyond financing, provides knowledge responses to the region’s development challenges. In this context, the IDB constantly works to leverage new technology to improve knowledge management, support efficiency in its operations, and disseminate valuable knowledge and insights for the region. To make this information more accessible and to solidify its value to the Bank’s business, we developed a series of proofs of concept (POCs) that use NLP and ML technologies. The purpose of this presentation is to share reflections gathered during the development of these POCs and the application of these approaches within the organization.
Kyle Strand, Lead Knowledge Management Specialist and Head of Library, Inter-American Development Bank (IDB)
Daniela Collaguazo, Text Analytics Consultant, Knowledge Innovation Communication Department, Inter-American Development Bank
Bertha Briceno, Lead Specialist, Knowledge and Learning Sector, Inter-American Development Bank
In the last 10 years, most of the academic research on entity extraction and content classification has focused on machine learning and complete automation. The latest tools are very precise, but in academic publishing, the use of automatic classification tools is still controversial. Publishers and information managers want the best of both worlds: a clear list of defined, managed keywords for their content and a cost-effective way of implementing the subject tagging. This presentation reviews the current use of machine-learning tools in publishing, both with and without the use of manually curated taxonomies.
Michael Upshall, Head of Business Development, UNSILO, Denmark
Thursday, November 9: 1:00 p.m. - 1:45 p.m.
Organizations are always looking for better ways to integrate their structured (databases and reports) and unstructured (documents and webpages) information. This concept is not new; in fact, it has been a primary information management goal for many years. The difference is that today, the technology to make this happen has matured to the point that it is a reality. This talk shares real-life examples of how this is done in large repositories using text analytics and ontologies. Session attendees will come away understanding what an ontology is and how it can be combined with text analytics tools to provide better analytics for their data scientists.
Zach Wahl, CEO, Enterprise Knowledge
This presentation discusses two recent projects in which enterprises benefited from direct interaction between taxonomies/ontologies and text analytics. While these are often seen as competing work streams, our recent work continues to build on the idea that complex, information-rich projects require both, and that pursuing one while abandoning the other often leads to poor results or project failure.
Gary Carlson, Founder, Factor
Thursday, November 9: 2:00 p.m. - 2:45 p.m.
Addressing the complexity of language ambiguity requires technology that can read and understand text the way people do. This session explains the concepts behind linguistic analysis, word disambiguation, and semantic reasoning, and the concepts that support a semantic platform. It demonstrates a semantic engine; explains how one mobile phone carrier deployed a self-help solution that automatically answered 24 million customer questions annually with 94% precision; and shows a knowledge platform that automatically organizes hundreds of data sources and millions of unstructured documents around multiple corporate taxonomies and entity clusters, using dynamically generated metadata in a precise and complete way.
Bryan Bell, Regional Vice President of Sales, Lucidworks
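As an illustration of word disambiguation in the spirit of this session (the semantic engine demonstrated in the talk is proprietary and not shown here), NLTK’s simplified Lesk algorithm can stand in:

# Pick the WordNet sense of "bank" that best overlaps the context.
# Requires the NLTK 'punkt' and 'wordnet' data packages.
from nltk import word_tokenize
from nltk.wsd import lesk

sentence = "I deposited cash at the bank before the loan meeting."
sense = lesk(word_tokenize(sentence), "bank", pos="n")
print(sense, "->", sense.definition() if sense else "no sense found")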
The next phase of how we communicate has already started. Popularized by Siri, Alexa, and the like, natural language interaction (NLI) has achieved commercial Q&A success. For organizations looking to create new experiences for their customers, NLI holds promise. But AI applications differ widely, and the distinction lies in the degree to which they are actually intelligent. This talk examines the considerations for enterprise application of NLI and how to avoid applications that just generate more white noise.
Fiona McNeil, Global Technology Product Marketer, SAS
Thursday, November 9: 3:00 p.m. - 3:45 p.m.
In the world of scholarly publishing (as well as many other industries, such as KM and information conferences!), meeting organizers are inundated with submissions for inclusion in conference programs. Given a large set of submissions, how can we develop tools to cluster submitted manuscripts into tracks based on topical similarity? This talk describes a project that used a subject taxonomy, NLP, and other text analytics tools, along with a large corpus of documents, to construct an application that clusters submitted manuscripts by topical similarity, including a GUI to interact with and analyze the results. This is not intended as a detailed technical talk (no slides of code!), nor is it a product spotlight; the focus is on using known, existing text analytics tools to build purpose-built applications that solve specific document-centric problems.
Bob Kasenchak, Information Architect, Factor
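The clustering core of such an application can be sketched in a few lines (the talk’s version also uses a subject taxonomy and a GUI, neither reproduced here, and the sample abstracts below are invented):

# Cluster manuscript abstracts into candidate tracks by topical similarity.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

abstracts = [
    "Deep learning for entity extraction in clinical notes",
    "Taxonomy governance for enterprise search",
    "Neural models for biomedical named entity recognition",
    "Managing enterprise metadata with ontologies",
]
X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # manuscripts sharing a label are candidates for the same track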
The universe of text analytics is bounded only by the output of the entire human race, which can and does result in huge, petabyte-scale problems. Technologies for this scalability, computational distribution, deep learning, resolution, and semantic expression are all new within the last 10 years, and their combination is revolutionary. Key to putting all of this together is that the text analytics are performed in the native language of the original text, prior to the inevitable loss of fidelity in machine or human translation. This talk covers a number of use cases, including counterterrorism, know-your-customer compliance, border security, disease tracking and detection, and countering fake news and conspiracy theories.
Christopher Biow, SVP, Global Public Sector, Basis Technology
Thursday, November 9: 4:00 p.m. - 4:15 p.m.
In recent years, document-centric search over information has been extended with graph-based content and data models. The implementation of semantic knowledge graphs in enterprises not only improves search in the traditional sense but also opens a path to integrating all types of data sources in a highly agile way. Linked data technologies have matured in recent years and can now serve as the basis for numerous critical tasks in enterprise information management. Hilger discusses how standards-based graph databases can be used for information integration, document classification, data analytics, and information visualization tasks. He shares how a semantic knowledge graph can be used to develop analytics applications on top of enterprise data lakes and illustrates how a large pharmaceutical company uses graph-based technologies to gain new insights into its research work from unified views and semantic search over heterogeneous data sources.
Joseph Hilger, COO, Enterprise Knowledge, LLC
Thursday, November 9: 4:15 p.m. - 5:00 p.m.
At the intersection of innovation, open data, and education, our speaker, a former government KM practitioner, shares her thoughts about the challenges and opportunities for organizations and communities in the coming years. She discusses empowering members of our communities and improving services using new tech like AI, machine learning, virtual and augmented reality, the Internet of Things, predictive analytics, gamification, and more. Are we moving toward anticipatory knowledge delivery (just enough, just in time, just for me), being in the flow of work at the teachable moment, establishing trust in a virtual environment, and learning from peer-to-peer marketplaces like Airbnb and Uber? She closes with her insights about the evolving digital transformation of every part of our world and hints at the magic sauce we need for a successful future!
Jeanne Holm, Senior Technology Advisor to the Mayor, Deputy CIO at City of Los Angeles, Information Technology Agency, City of Los Angeles and UCLA, Open Data Collaboratives, International Academy of Astronautics