The Strange Case of Pneumonia in 2019

PART I

What Social Media Analysis can tell us about Early Detection of COVID-19, Legionnaires’ Disease and other Anomalies.

Analyzing social media data can be a powerful tool for the early detection of key events. Tracking specific keywords on Twitter points to an unusual increase in pneumonia cases in late 2019, most likely linked to the COVID-19 pandemic (an idea outlined in a Nature paper in January 2021).

The analysis focuses on the number of tweets and users mentioning specific keywords during a given period of time. To showcase the power of our tools, we have reproduced part of that analysis and extended it with our natural language processing tools.
We have focused on two countries, Italy and France, with a total of about 14,000 unique tweets. For the statistical methods and datasets, we refer readers to the original paper.

Let’s first look at the (normalized) cumulative number of tweets recorded in Italy and France from January 1, 2018 to April 1, 2020. Both countries show a steep spike in early 2020. Italy also shows two unusual knees, where the number of tweets rises suddenly, during the summers of 2018 and 2019. Nothing of the kind is seen in France. We will come back to those knees later, but for now let’s focus on the spike in early 2020.

Cumulative Tweets: Italy

The plot shows the cumulative number of tweets containing the word “polmonite” (pneumonia) from January 1, 2018 until April 1, 2020. Two knees are seen in the summers of 2018 and 2019, plus a steep spike in early 2020.

Cumulative Tweets: France

The plot shows the cumulative number of tweets containing the word “pneumonie” (pneumonia) from January 1, 2018 until April 1, 2020. A steep spike is seen in early 2020, with no other anomaly.
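For readers who want to build curves like the two above, here is a minimal Python sketch of a normalized cumulative count. It assumes the tweets have already been filtered by keyword and language and reduced to a list of posting dates. The function name `normalized_cumulative` and the toy dates are ours, chosen for illustration; they are not taken from the original paper.

```python
from collections import Counter
from datetime import date, timedelta
from itertools import accumulate


def normalized_cumulative(tweet_dates, start, end):
    """Return (days, values), where values[i] is the fraction of all tweets
    posted in [start, end] that were posted on or before days[i]."""
    # Count tweets per calendar day, ignoring anything outside the window.
    counts = Counter(d for d in tweet_dates if start <= d <= end)
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    # Running total over consecutive days (days without tweets add 0).
    running = list(accumulate(counts.get(day, 0) for day in days))
    total = running[-1]
    if total == 0:
        return days, [0.0] * len(days)
    # Dividing by the final total makes every curve end at 1.0, so countries
    # with very different tweet volumes can be compared on the same scale.
    return days, [r / total for r in running]


# Toy data (made up): four tweets over four days.
sample = [date(2020, 1, 1), date(2020, 1, 1), date(2020, 1, 3), date(2020, 1, 4)]
days, curve = normalized_cumulative(sample, date(2020, 1, 1), date(2020, 1, 4))
print(curve)  # [0.5, 0.5, 0.75, 1.0]
```

On such a curve, a knee or spike appears as a stretch where the slope is clearly steeper than in the neighboring weeks.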

To better understand what is happening, we first look at the number of unique users tweeting in Italy and France over the period considered. If nothing relevant is happening, we should expect only random fluctuations in the number of users, and no systematic difference between the 2018-2019 and 2019-2020 periods. However, we detect a clear difference between the total number of users in early 2020 and in early 2019. France shows a significant increase in the number of users mentioning pneumonia as early as November 2019. Italy again shows something unusual during the summers of 2018 and 2019: two bumps (see figures below) that match the knees seen in the cumulative plots above.

Unique Twitter Users: Italy

Number of unique users tweeting in Italian and mentioning the keyword “polmonite”. The x-axis indicates the month. The red line covers the period from 03-2019 until the end of 02-2020; the black line covers 03-2018 until the end of 02-2019.

Unique Twitter Users: France

Number of unique users tweeting in French and mentioning the keyword “pneumonie”. The x-axis indicates the month. The red line covers the period from 03-2019 until the end of 02-2020; the black line covers 03-2018 until the end of 02-2019.
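The monthly user counts behind the two figures above can be sketched as follows. This is an illustrative reconstruction, not the original code: it assumes each tweet has been reduced to a `(user_id, date)` pair, and the function name `monthly_unique_users` and the toy tweets are hypothetical. Each 12-month window runs from March to the end of the following February, matching the red and black lines.

```python
from collections import defaultdict
from datetime import date


def monthly_unique_users(tweets, first_year):
    """Count distinct users per month for the window running from March of
    first_year to the end of February of first_year + 1.

    `tweets` is an iterable of (user_id, date) pairs. The result is a list
    of 12 counts: index 0 is March, index 11 is the following February."""
    users = defaultdict(set)
    for user_id, day in tweets:
        # Months elapsed since March of first_year.
        offset = (day.year - first_year) * 12 + (day.month - 3)
        if 0 <= offset < 12:
            # A set keeps each user once, however often they tweet.
            users[offset].add(user_id)
    return [len(users[i]) for i in range(12)]


# Toy data (made up).
tweets = [
    ("a", date(2019, 3, 5)),
    ("a", date(2019, 3, 20)),   # same user, same month: counted once
    ("b", date(2019, 3, 9)),
    ("c", date(2020, 2, 1)),
    ("d", date(2020, 3, 1)),    # outside both windows
    ("a", date(2018, 11, 2)),
]
red_line = monthly_unique_users(tweets, 2019)    # 03-2019 to 02-2020
black_line = monthly_unique_users(tweets, 2018)  # 03-2018 to 02-2019
print(red_line)    # [2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(black_line)  # [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
```

Counting users rather than tweets reduces the weight of a few very active accounts, which is why the two views are worth comparing.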

According to the method outlined in the Nature paper (Lopreite et al. 2021), these anomalies should be detectable from the p-values obtained when comparing the cumulative distributions of the number of tweets. We have repeated their analysis, and our result is compatible with the one reported by the authors. Up to this point we have not changed their method in any way; this part of the analysis serves only to reproduce their results.
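To give a feel for what comparing two cumulative distributions with a p-value involves, the sketch below uses a two-sample Kolmogorov-Smirnov statistic with a permutation p-value. This is our own stand-in, chosen because it needs only the Python standard library; the exact test used in the paper is not restated here and may differ. The function names and the toy data are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest vertical gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        # Fraction of each sample that is <= x.
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def permutation_p_value(a, b, n_perm=500, seed=0):
    """Share of random relabelings whose KS statistic is at least as large
    as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (made up): day-of-season index (0 = March 1) of each tweet.
baseline_season = list(range(0, 360, 4))  # tweets spread evenly
later_season = list(range(0, 360, 8)) + [330 + i % 30 for i in range(45)]  # late surge

stat = ks_statistic(baseline_season, later_season)
p = permutation_p_value(baseline_season, later_season)
print(f"KS statistic = {stat:.3f}, permutation p-value = {p:.4f}")
```

A small p-value says that the timing of tweets in the two seasons differs by more than random relabeling would explain. It does not say why, which is where reading the tweets themselves comes in.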

News coverage in Italy of the Legionnaires’ disease outbreak in September 2018.

So far, this qualitatively reproduces the results of the paper. However, we want to go a step further and investigate what happened in Italy during the summers of 2018 and 2019. The first interesting result is that between July and October 2018 there were multiple Legionnaires’ disease outbreaks in Italy, widely reported by the press. The first knee and bump seen on Twitter for Italy roughly coincide with those cases.

What about 2019? An inspection of the tweets from August 2019 shows that the second knee and bump coincide with the time when Maurizio Sarri, then coach of the Italian football club Juventus F.C., was reported to have a severe case of pneumonia. A comparison of the cumulative distributions of tweets in the two years considered also shows an anomaly around the end of August 2019, consistent with this finding.

Manifesto Z01

Artificial intelligence (AI) is an extremely powerful tool when used correctly. Its potential can affect people's lives in many different ways and accelerate human progress exponentially. Today AI helps solve problems in medicine, makes our homes smarter, improves security, and frees people from repetitive, monotonous tasks. AI has the potential to become a tool in the service of equality and freedom, pushing the human condition toward broader horizons.

Over the last ten years, AI has begun to show its true potential and now has a concrete impact on our everyday lives. However, it can also be turned to less noble purposes that conflict with the idea of progress. AI can be used for criminal purposes, to manipulate reality and accelerate the spread of false news, to create deepfakes, and to mount effective cyberattacks.

Below we state our ambitions as a startup entering the world of AI.

We are committed to fighting the use of AI for harmful purposes by seeking solutions at the root of the problem.

We are committed to investing our time in finding solutions that improve society for the benefit of all.

We are committed to improving existing technologies by exploring new tools together with the scientific community.

We are committed to using artificial intelligence ethically to improve quality of life, respect privacy, and promote knowledge.
