Sentiment and Opinion Analysis are increasingly being used as synonyms, and even Wikipedia provides the following definition of the former: “Sentiment analysis (also known as opinion mining) refers to the use […]”.

However, there are substantial differences between the two, since identifying the mood of a text is different from extracting opinions. Let’s think about the following phrase:

“These cookies are very good”.

Here it is relatively easy to label it with the correct mood, which is “POSITIVE”, but what about the sentence below?

“This company makes very good cookies, but they contain a lot of fat and are too expensive.”

Most Sentiment Analysis tools would forcibly label the above through an algorithm based on generic semantic interpretation rules and, in the best case, they would attribute a “NEUTRAL” (or “POSITIVE”) mood to it.

Now, here it is clear that such a choice not only fails to reflect reality accurately, but also provides indications that are potentially opposite to the actual sentiment expressed.

Forcing a label on a mood “at any cost” is one of the main reasons why most sentiment analyses show a high percentage of neutrality (above 60-70%), and this should ring an alarm bell for the recipients: if it is true that people write on social networks to express their opinions or share information, then it would be quite odd to conclude a survey by asserting that most posts are “NEUTRAL”.

Such a result should not be blamed (just) on the tools or the algorithms used in the analysis: try attributing a sentiment to the sentence at issue yourself. I am sure that when asked the question “What mood would you choose to label the above sentence?”, most people would answer “IT DEPENDS”.

And this is exactly where the problem lies: a tool for analysis cannot afford to answer “IT DEPENDS”, so in the absence of precise instructions all it can do is force an answer, even a blatantly inaccurate one.

Let’s now try to examine the aforementioned sentence by setting a well-defined condition. What would our answer be if the question were formulated as follows: “What is the opinion on the price of the cookies?” or “What is the opinion on the taste of the cookies?” I am pretty sure that in this case, instead of “IT DEPENDS”, the answer would be a precise label (or category).
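To make the contrast concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the word weights, the aspect names and the functions `overall_sentiment` and `opinions_by_aspect` are all invented for this example and do not describe how any real tool works. It shows how a single forced label can cancel out to “NEUTRAL”, while asking about one aspect at a time gives a precise answer.

```python
import re

# Toy lexicon, invented for this illustration: word -> (aspect, weight).
# Real tools rely on far richer linguistic resources.
LEXICON = {
    "good": ("taste", 2),
    "fat": ("nutrition", -1),
    "expensive": ("price", -1),
}


def label(score):
    """Turn a numeric score into a mood label."""
    if score > 0:
        return "POSITIVE"
    if score < 0:
        return "NEGATIVE"
    return "NEUTRAL"


def tokenize(text):
    """Lowercase the text and split it into plain words."""
    return re.findall(r"[a-z]+", text.lower())


def overall_sentiment(text):
    """Force one label on the whole text by summing every weight."""
    total = sum(LEXICON[w][1] for w in tokenize(text) if w in LEXICON)
    return label(total)


def opinions_by_aspect(text):
    """Keep a separate score, and therefore a separate label, per aspect."""
    scores = {}
    for word in tokenize(text):
        if word in LEXICON:
            aspect, weight = LEXICON[word]
            scores[aspect] = scores.get(aspect, 0) + weight
    return {aspect: label(score) for aspect, score in scores.items()}


sentence = ("This company makes very good cookies, "
            "but they contain a lot of fat and are too expensive.")

print(overall_sentiment(sentence))   # one forced label for the whole sentence
print(opinions_by_aspect(sentence))  # one label per aspect (taste, nutrition, price)
```

With these toy weights the positive and negative words cancel each other out in the overall score, whereas the per-aspect view keeps the opinion on taste separate from the opinions on nutrition and price.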

Well, accuracy of answers is one of the substantial differences between Sentiment Analysis and Opinion Analysis, though not the only one.

Any time we ask somebody for an opinion, we expect a reasoned answer or, in any case, the precise expression of a personal thought. I will use another example to explain the concept: if you ask a question like “Why do you like these cookies?”, what kind of answer would you expect? Most likely a “positive perception” or a structured comment, like “I like these cookies because they taste like butter”.

This is another difference between Sentiment Analysis and Opinion Analysis: while with the former the most we can obtain is a more or less precise representation of the mood on a single, well-circumscribed topic, the latter actually provides answers to our questions.

By way of example, you can see a possible representation of an opinion analysis below. The analysis here is on the perception of users towards the customer support service of a manufacturing company operating in the consumer electronics business:

[Figure: schema2]

As you can see, regardless of the specific case, Opinion Analysis not only allows us to determine the mood but also, and especially, to comprehend the underlying reasons.

One of the greatest difficulties when listening to the web is to separate opinions from mere news, namely posts or articles that do not reveal any opinions, but are simply meant to share information, or to separate opinions from the communications of the brand we are investigating.

In this case, Sentiment Analysis is not enough to understand what people think about a given brand, product or service: what we need is an Opinion Analysis.
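As a rough illustration of that separation step, the Python sketch below sorts posts into brand communications, opinions and plain news. Everything in it is a hypothetical assumption made for the example: the account name in `BRAND_ACCOUNTS`, the cue words in `OPINION_CUES`, the sample posts and the `classify_post` function itself. A real analysis would need much more than keyword matching.

```python
import re
from collections import Counter

# Hypothetical inputs, invented for this sketch.
BRAND_ACCOUNTS = {"cookieco_official"}  # accounts owned by the brand
OPINION_CUES = {"i", "my", "love", "hate", "like", "think", "expensive"}


def classify_post(author, text):
    """Return BRAND_COMMUNICATION, OPINION or NEWS for a single post."""
    # Posts written by the brand itself are not end-user opinions.
    if author in BRAND_ACCOUNTS:
        return "BRAND_COMMUNICATION"
    # A post that contains at least one cue word is treated as an opinion.
    words = set(re.findall(r"[a-z]+", text.lower()))
    if words & OPINION_CUES:
        return "OPINION"
    # Everything else is treated as plain information sharing.
    return "NEWS"


posts = [
    ("cookieco_official", "Our new cookies are in stores today!"),
    ("newsbot", "CookieCo launches a new cookie line this week."),
    ("anna", "I think these cookies are too expensive."),
]

# Count how many posts fall into each category.
volumes = Counter(classify_post(author, text) for author, text in posts)
print(volumes)
```

Counting the categories in this way is what makes it possible to compare the volume of actual opinions with the volume of news and brand communications.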

The following picture shows the graphic effect of the aforementioned approach applied to a survey carried out during the launch of a consumer product:

[Figure: schema1]

The first thing that stands out is how strikingly different the volumes at stake are. Above all, the number of posts in which users express their actual opinions on a given subject is considerably lower than the number of posts that offer general information without any expression from end users. Yet those opinions are exactly what brands are looking for when analyzing their customer base and potential customers.