Coverage: Insius does not rely on a limited source set but searches all websites publicly available to search engines, using its own crawler infrastructure. This maximizes coverage while keeping results relevant: only content that search engines can retrieve can be found by an information-seeking user, and potentially influence them.
Complete posts: Our systems do not merely index RSS feeds or treat each webpage as a single article. Insius crawlers use computer-vision techniques to detect and collect the separate articles within a webpage (e.g. individual posts within a thread) and remove boilerplate content such as navigation, ads, and surveys.
Duplicate and content spam detection: A duplicate detection system removes copied and duplicate posts from the dataset, ensuring that the same content is counted only once. The method is robust enough that copies are still detected even when their text has been modified.
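The idea of catching copies whose text has been slightly modified can be illustrated with a minimal sketch. It assumes word shingling and Jaccard similarity, a common textbook technique, and is not a description of the Insius system; the function names and the 0.5 threshold are hypothetical.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased text, ignoring punctuation."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def deduplicate(posts, threshold=0.5):
    """Keep a post only if it is not too similar to any post kept before it."""
    kept, kept_shingles = [], []
    for post in posts:
        current = shingles(post)
        if any(jaccard(current, seen) >= threshold for seen in kept_shingles):
            continue  # near-duplicate of an earlier post: count it only once
        kept.append(post)
        kept_shingles.append(current)
    return kept

# Illustrative posts (made up for this sketch)
posts = [
    "The new tires grip really well on wet roads and last a long time.",
    "The new tires grip really well on wet roads and last a very long time!",
    "I waited two hours at the service desk.",
]
unique_posts = deduplicate(posts)
```

Here the second post differs from the first by one inserted word and different punctuation, so it is dropped, while the unrelated third post is kept. Comparing every post against every kept post is quadratic; a production system would more likely use hashing techniques to scale.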
Detection of user-generated and topic-relevant Web content
User-generated content: The Web contains content generated by users in private contexts as well as professional and editorial content. Both kinds are often mixed even on a single webpage, such as user comments next to a product description. To measure the voice of the consumer, it is essential to differentiate properly between user-generated and editorial content. Insius has implemented methods that automatically distinguish between user-generated and professional content, delivering only the content that is actually relevant to you.
Topic-relevant content: There is no easy way to define topics properly in a machine-readable way. A keyword like "Continental" may lead to content relating to an airline, a tire manufacturer, and others. A usual approach is to use a Boolean query such as "Continental AND (tire OR car) NOT airline" to get only the fragment of results matching your scope. Because of the complexity of human language, this method is of limited use: it either misses relevant results or matches too many irrelevant ones. Through the use of advanced information retrieval methods, Insius is capable of properly distinguishing between topics without the need for error-prone, manually drafted Boolean search queries.
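The limits of the Boolean approach can be shown with a small sketch that applies the query quoted above to three sample posts. The posts and the helper names are made up for illustration, and the matching is deliberately naive (exact lowercase word matches).

```python
import re

def tokens(text):
    """Lowercased alphabetic words of a text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def boolean_match(text):
    """Evaluate: Continental AND (tire OR car) NOT airline."""
    words = tokens(text)
    return (
        "continental" in words
        and ("tire" in words or "car" in words)
        and "airline" not in words
    )

# Illustrative posts (hypothetical), each paired with whether it is truly on topic
posts = [
    ("My Continental tire lasted 40,000 miles.", True),
    ("Continental winter tyres are great in snow.", True),        # British spelling, plural
    ("Rented a car after my Continental flight landed.", False),  # about the airline
]

for text, relevant in posts:
    print(f"matched={boolean_match(text)!s:5} relevant={relevant!s:5} {text}")
```

The second post is relevant but missed because it says "tyres", and the third is matched although it is about a flight that never mentions the word "airline". Each such case can be patched with more operators, which is exactly how manually drafted queries become long and error-prone.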
Domain-sensitive sentiment analysis: Sentiment analysis is error-prone if it neglects that terms carry different meanings in different domains. Insius sentiment analysis evaluates every sentiment within its context: while a "long battery life" is positive, a "long waiting time" may be negative. Negations such as in "not expensive" are also detected and taken into account, in this example leading to a positive sentiment.
Fine-grained sentiment detection on the level of statements and concepts: Unlike common sentiment analysis tools, Insius does not measure sentiment (positive, negative or neutral) at the level of posts or articles. To us, this is far too coarse, as a post rarely has a single polarity as a whole. Instead, using natural language processing techniques, our algorithms identify the individual concepts and statements that carry a polarity.
Driver analysis: Besides finding concepts, the Insius driver analysis is able to tell why a certain concept is perceived positively or negatively. Knowing the exact reason is essential for recommending proper action.
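The points on context-dependent polarity, negation, and statement-level scoring can be sketched with a toy lexicon. This is only an illustration under strong assumptions: the lexicon entries, the negation list, and the splitting rule are hypothetical and far simpler than real natural language processing.

```python
import re

# Hypothetical lexicon: polarity of a modifier depends on the aspect it describes.
CONTEXT_POLARITY = {
    ("long", "battery life"): 1,
    ("long", "waiting time"): -1,
}
# Hypothetical context-free polarities.
DEFAULT_POLARITY = {"expensive": -1, "great": 1}
NEGATIONS = {"not", "never", "no"}

def statement_polarity(statement):
    """Return 1 (positive), -1 (negative) or 0 (neutral) for one statement."""
    words = re.findall(r"[a-z']+", statement.lower())
    score = 0
    for i, word in enumerate(words):
        aspect = " ".join(words[i + 1:i + 3])  # the two words after the modifier
        if (word, aspect) in CONTEXT_POLARITY:
            polarity = CONTEXT_POLARITY[(word, aspect)]
        else:
            polarity = DEFAULT_POLARITY.get(word, 0)
        if polarity and i > 0 and words[i - 1] in NEGATIONS:
            polarity = -polarity  # "not expensive" flips to positive
        score += polarity
    return (score > 0) - (score < 0)

def analyze_post(post):
    """Split a post into statements and score each one separately."""
    parts = re.split(r"[.!?;]|\bbut\b", post)
    return [(p.strip(), statement_polarity(p)) for p in parts if p.strip()]

results = analyze_post(
    "The long battery life is great, but there was a long waiting time at support."
)
```

Scoring the post as a whole would cancel the two opinions out, whereas the statement-level result keeps one positive statement (battery life) and one negative statement (waiting time) apart.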
Visualizing perception as a network map
Using natural language processing, concepts that describe topics highly relevant to consumers are detected and visualized. A topic's strength can be read from its distance to the center: the more important the topic, the more centrally it is located.
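The relationship between importance and distance to the center can be sketched as a simple radial layout. The mapping used here (radius shrinking linearly with importance, topics spread evenly around the circle) is an assumption for illustration, not the layout Insius uses, and the importance scores are made up.

```python
import math

def radial_layout(importance):
    """Map each topic to (x, y); more important topics sit closer to the center.

    Assumes a non-empty dict of positive importance scores.
    """
    top = max(importance.values())
    ranked = sorted(importance.items(), key=lambda item: -item[1])
    positions = {}
    for i, (topic, weight) in enumerate(ranked):
        radius = 1.0 - weight / top           # the strongest topic gets radius 0
        angle = 2 * math.pi * i / len(ranked)  # spread topics around the circle
        positions[topic] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions

# Hypothetical importance scores
positions = radial_layout({"grip": 10, "price": 5, "noise": 2})
distances = {topic: math.hypot(x, y) for topic, (x, y) in positions.items()}
```

With these scores, "grip" lands at the center, "price" at half the maximum radius, and "noise" furthest out.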
Visualizing consumers' thought patterns
When a large number of online consumer discussions are analyzed, patterns can emerge from a bird's-eye view. Insius methods detect these patterns of concepts and topics, reveal which topics are connected in the minds of the consumers discussing online, and thereby describe their thought patterns. This information shows you which topics marketing activities could trigger to activate the maximum number of positive associations.
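One simple way to approximate which topics are connected is to count how often two topics appear in the same discussion. The sketch below assumes that each discussion has already been reduced to a list of topic labels; the labels and the function name are hypothetical, and plain co-occurrence counting stands in for whatever method is actually used.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(discussions):
    """Count, for every pair of topics, the number of discussions containing both."""
    counts = Counter()
    for topics in discussions:
        # sorted(set(...)) makes each pair unique and order-independent
        for pair in combinations(sorted(set(topics)), 2):
            counts[pair] += 1
    return counts

# Hypothetical topic labels extracted from four discussions
discussions = [
    ["price", "grip"],
    ["grip", "safety"],
    ["grip", "safety", "price"],
    ["safety", "grip"],
]
pair_counts = cooccurrence(discussions)
strongest_pair, strength = pair_counts.most_common(1)[0]
```

In this toy data "grip" and "safety" co-occur most often, which suggests that a message about one is likely to evoke the other.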