Tracking main news entities and their relations through time
We are living in an era of rapidly growing information. Huge amounts of textual content are produced daily by news agencies around the world. News analysis can transform this data into valuable information.
In particular, obtaining insights about up-to-date relations between business-related entities can help companies minimise risks and maximise profits.
Numerous tools aggregate and mine this data to extract the entities mentioned and to obtain sentiment insights. As technology has advanced, these tools have grown in complexity and in the number of options they provide. However, there is demand for tools that give a simple yet holistic summary of the searched topic in order to provide general insights.
In the euBusinessGraph project we are working on the Relation Tracker tool, which is based on news data from Event Registry. Event Registry is a system for real-time collection, annotation and analysis of content published by global news outlets.
For relation extraction we take events from Event Registry and group them into topics. Within each topic, an interactive graph shows the main entities at a selected time and the type of relations between those entities. In addition, we visualize summary information about entities and their relationships.
The figure below presents a graph of entities with a broad relation between them characterised by a topic (described by Wikipedia concepts). For instance, the link between Apple Inc. and Wall Street is characterised by ‘Investing’.
How we do it
Our approach involves a number of steps, such as:
Cluster and format data: To group the events into topics, we use the K-Means clustering algorithm. Each event is represented as a sparse vector of its non-entity concepts, with weights equal to the scores of those concepts in that event.
Choose the main entities: Within each topic, the top entities in each time period have to be selected.
Detect the type of relationship: The type of relationship between two entities is based on the events they share. We use the non-entity concepts of the shared events to determine the relationship.
Characteristics of the main graph: The interactive network graph (as shown in the image above) has these features:
- The main entities within a topic at the selected time are represented by the vertices of the graph
- The size of the vertices reflects the importance value of each entity, scaled to a suitable ratio to fit in the canvas
- The colors represent the type of the entity, whether it is a person or an organisation
- The links between the entities represent the existence of shared events between them in that time period under the relation topic. The thickness of a link is proportional to the number of shared events, whereas its label shows the type of relationship, determined as described above
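The entity-selection and relationship steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Relation Tracker implementation: the event records, their field names and scores are made up, importance is assumed to be the summed entity score, and the relationship label is assumed to be the non-entity concept with the highest summed score over the shared events. Clustering into topics is left out, so the events here are taken to belong to one topic and one time period.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical events of one topic and time period (not real Event Registry data).
# "entities" and "concepts" map names to assumed relevance scores.
events = [
    {"entities": {"Apple Inc.": 90, "Wall Street": 60},
     "concepts": {"Investing": 80, "Stock": 50}},
    {"entities": {"Apple Inc.": 70, "Wall Street": 50, "Tim Cook": 40},
     "concepts": {"Investing": 60, "Smartphone": 30}},
    {"entities": {"Apple Inc.": 80, "Tim Cook": 60},
     "concepts": {"Smartphone": 90}},
]


def top_entities(events, n):
    """Rank entities by summed score (assumed importance measure)."""
    importance = Counter()
    for ev in events:
        importance.update(ev["entities"])  # adds the scores per entity
    return importance.most_common(n)


def relations(events, entities):
    """For each pair of selected entities, count shared events and pick a label."""
    shared = defaultdict(list)
    for ev in events:
        present = sorted(e for e in ev["entities"] if e in entities)
        for pair in combinations(present, 2):
            shared[pair].append(ev)

    result = {}
    for pair, shared_events in shared.items():
        concept_scores = Counter()
        for ev in shared_events:
            concept_scores.update(ev["concepts"])
        # Assumed labelling rule: the concept with the highest summed score.
        label = concept_scores.most_common(1)[0][0] if concept_scores else None
        result[pair] = {"shared_events": len(shared_events), "label": label}
    return result


main = top_entities(events, 3)                 # vertices and their sizes
links = relations(events, {name for name, _ in main})  # edges, thickness, labels

for (a, b), info in sorted(links.items()):
    print(f"{a} -- {b}: {info['shared_events']} shared event(s), label {info['label']}")
```

In graph terms, the summed scores from `top_entities` would drive the vertex sizes, `shared_events` the link thickness, and `label` the link label.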
What Elon Musk and Jeff Bezos have in common
The figure below presents a summary of relations for the entity Elon Musk and the entity Jeff Bezos. It can be observed that both entities occur in four shared events, which can be described by the shared characteristics, such as »Astronaut« and »International Space Station«.
The figure below shows a stream graph view of the top features of the relationship between Jeff Bezos and Elon Musk through time. It can be seen that in May 2015 a number of characteristics, such as »Astronaut«, »International Space Station«, »Spacecraft« and »Orbit«, were present, while toward the end of 2015 a number of other features (»Spaceflight«, »Cape Canaveral« and others) appeared.
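The data behind such a stream graph can be thought of as per-period counts of the concepts attached to the shared events. The sketch below shows one way to build them, assuming monthly buckets; the exact dates and the assignment of concepts to individual events are invented for illustration and are not taken from the tool.

```python
from collections import Counter, defaultdict

# Hypothetical shared events for one entity pair: (ISO date, non-entity concepts).
shared_events = [
    ("2015-05-12", ["Astronaut", "International Space Station", "Spacecraft", "Orbit"]),
    ("2015-05-28", ["Astronaut", "Orbit"]),
    ("2015-11-24", ["Spaceflight", "Cape Canaveral"]),
    ("2015-12-22", ["Spaceflight"]),
]


def features_by_month(shared_events):
    """Count how often each concept occurs in the shared events of each month."""
    series = defaultdict(Counter)
    for date, concepts in shared_events:
        month = date[:7]  # "YYYY-MM" prefix of the ISO date
        series[month].update(concepts)
    return dict(series)


series = features_by_month(shared_events)
for month in sorted(series):
    print(month, series[month].most_common(3))
```

Each month's counter gives the heights of the streams at that point on the time axis.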
Although in our first attempt we were able to detect the characteristics of the relationship between entities and how they change through time, the main types of relations are still quite broad; for example, »Financial Services« covers many different activities. We would also like to show, for example, how and when one company acquires another. The aim is therefore to improve the accuracy of detecting the main relation by using more detailed methods, such as deep learning.
News and Events enrichment
Ultimately, the Relation Tracker and its features will be part of the euBusinessGraph Marketplace services, providing valuable insights into news and events.
If you would like to know more about relation extraction from news, please contact the Artificial Intelligence Laboratory at the Jozef Stefan Institute (firstname.lastname@example.org).