Similar literature
Found 20 similar articles.
1.

Background

Twitter is home to many health professionals who send messages about a variety of health-related topics. Amid concerns about physicians posting inappropriate content online, more in-depth knowledge about these messages is needed to understand health professionals’ behavior on Twitter.

Objective

Our goal was to characterize the content of Twitter messages, specifically focusing on health professionals and their tweets relating to health.

Methods

We performed an in-depth qualitative content analysis of 700 tweets posted by health-related users on Twitter. The primary objective was to describe the general type of content (ie, health-related versus non-health-related) on Twitter authored by health professionals and further to describe health-related tweets on the basis of the type of statement made. Specific attention was given to whether a tweet was personal (as opposed to professional) or made a claim that users would expect to be supported by some level of medical evidence (ie, a “testable” claim). A secondary objective was to compare content types among different users, including patients, physicians, nurses, health care organizations, and others.

Results

Health-related users are posting a wide range of content on Twitter. Among health-related tweets, 53.2% (184/346) contained a testable claim. Of health-related tweets by providers, 17.6% (61/346) were personal in nature; 61% (59/96) made testable statements. While organizations and businesses use Twitter to promote their services and products, patient advocates are using this tool to share their personal experiences with health.

Conclusions

Twitter users in health-related fields tweet about both testable claims and personal experiences. Future work should assess the relationship between testable tweets and the actual level of evidence supporting them, including how Twitter users, especially patients, interpret the content of tweets posted by health providers.

2.

Background

Online social media, such as the microblogging site Twitter, have become a space for speedy exchange of information regarding sexually transmitted diseases (STDs), presenting a potential risk environment for how STDs are portrayed. Examining the types of “tweeters” (users who post messages on Twitter) and the nature of “tweet” messages is important for identifying how information related to STDs is posted in online social media.

Objective

The intent of the study was to describe the types of message emitters on Twitter in relation to two different STDs—chlamydia and human immunodeficiency virus (HIV)—as well as the nature of content tweeted, including how seriously the topic was treated.

Methods

We used the Twitter search engine to look for tweets posted worldwide from August 1-7, 2013, and from September 1-7, 2013, containing the words “chlamydia” or “HIV”, and the hashtags “#chlamydia” or “#HIV”. Tweeters were classified by two independent reviewers according to the type of avatar of the user (human, logo, or fantasy), the identification of the emitter (identifiable, semi-identifiable, or non-identifiable), and the source (private company, general media, scientific media, non-governmental, individual account, academic institution, government department, or undefined). Tweet messages were also independently classified according to their nature (serious or jokes/funny), and whether their main message was factual or of a personal nature/experience.

Results

A total of 694 tweets were posted by 426 different users during the first 7 days of August and September, containing the hashtags and/or simple words “chlamydia” and/or “HIV”. Jokes or funny tweets were more frequently posted by individual users (89%, 66/74), with a human avatar (81%, 60/74), from a non-identifiable user (72%, 53/74), and they were most frequently related to chlamydia (76%, 56/74). Serious tweets were most frequently posted by the general media (20.6%, 128/620), using a logo avatar (66.9%, 415/620), and with identifiable accounts (85.2%, 528/620). No government departments, non-governmental organizations, scientific media, or academic institutions posted a joke on STDs. A total of 104 of these analyzed tweets were re-tweeted messages, belonging to 68 unique tweets. The content was serious (99%, 67/68), factual (90%, 52/58), and about HIV (85%, 58/68).

Conclusions

Social media such as Twitter may be an important source of information regarding STDs provided that the topic is presented appropriately. Reassuringly, the study showed that almost 9/10 of tweets on STDs (chlamydia and HIV) were of serious content, and many of the tweets that were re-tweeted were facts. The jokes that were tweeted were mainly about chlamydia, and posted by non-identifiable emitters. We believe social media should be used to an even larger extent to disseminate correct information about STDs.

3.

Background

Sleep issues such as insomnia affect over 50 million Americans and can lead to serious health problems, including depression and obesity, and can increase risk of injury. Social media platforms such as Twitter offer exciting potential for their use in studying and identifying both diseases and social phenomena.

Objective

Our aim was to determine whether social media can be used as a method to conduct research focusing on sleep issues.

Methods

Twitter posts were collected and curated to determine whether a user exhibited signs of sleep issues based on the presence of several keywords in tweets such as insomnia, “can’t sleep”, Ambien, and others. Users whose tweets contain any of the keywords were designated as having self-identified sleep issues (sleep group). Users who did not have self-identified sleep issues (non-sleep group) were selected from tweets that did not contain pre-defined words or phrases used as a proxy for sleep issues.
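
The keyword-based grouping described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the keyword list contains only the three terms named in the abstract (the full list is not given), and the function names and the `(user_id, text)` input shape are assumptions made for the example.

```python
import re

# Hypothetical keyword list: the abstract names only these three terms
# and says there were others.
SLEEP_KEYWORDS = ["insomnia", "can't sleep", "ambien"]

# One case-insensitive pattern that matches any keyword as a whole word/phrase.
_PATTERN = re.compile(
    "|".join(r"\b" + re.escape(keyword) + r"\b" for keyword in SLEEP_KEYWORDS),
    re.IGNORECASE,
)


def normalize(text):
    """Map the typographic apostrophe to a plain one before matching."""
    return text.replace("\u2019", "'")


def mentions_sleep_issue(text):
    """Return True if the tweet text contains any of the keywords."""
    return _PATTERN.search(normalize(text)) is not None


def assign_groups(tweets):
    """Split users into (sleep_group, non_sleep_group).

    tweets: iterable of (user_id, text) pairs.
    A user joins the sleep group if any one of their tweets matches.
    """
    sleep_users, seen_users = set(), set()
    for user_id, text in tweets:
        seen_users.add(user_id)
        if mentions_sleep_issue(text):
            sleep_users.add(user_id)
    return sleep_users, seen_users - sleep_users
```

A keyword match is only a proxy for a self-identified sleep issue, so a sketch like this would also flag jokes or tweets about other people.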

Results

User data such as number of tweets, friends, followers, and location were collected, as well as the time and date of tweets. Additionally, the sentiment of each tweet and average sentiment of each user were determined to investigate differences between non-sleep and sleep groups. It was found that sleep group users were significantly less active on Twitter (P=.04), had fewer friends (P<.001), and fewer followers (P<.001) compared to others, after adjusting for the length of time each user's account has been active. Sleep group users were more active during typical sleeping hours than others, which may suggest they were having difficulty sleeping. Sleep group users also had significantly lower sentiment in their tweets (P<.001), indicating a possible relationship between sleep and psychosocial issues.

Conclusions

We have demonstrated a novel method for studying sleep issues that allows for fast, cost-effective, and customizable data to be gathered.

4.

Background

Marketing and use of electronic cigarettes (e-cigarettes) and other electronic nicotine delivery devices have increased exponentially in recent years, fueled in part by marketing and word-of-mouth communications via social media platforms such as Twitter.

Objective

This study examines Twitter posts about e-cigarettes between 2008 and 2013 to gain insights into (1) marketing trends for selling and promoting e-cigarettes and (2) locations where people use e-cigarettes.

Methods

We used keywords to gather tweets about e-cigarettes between July 1, 2008 and February 28, 2013. A randomly selected subset of tweets was manually coded as advertising (eg, marketing, advertising, sales, promotion) or nonadvertising (eg, individual users, consumers), and classification algorithms were trained to code the remaining data into these 2 categories. A combination of manual coding and natural language processing methods was used to indicate locations where people used e-cigarettes. Additional metadata were used to generate insights about users who tweeted most frequently about e-cigarettes.

Results

We identified approximately 1.7 million tweets about e-cigarettes between 2008 and 2013, with the majority of these tweets being advertising (93.43%, 1,559,508/1,669,123). Tweets about e-cigarettes increased more than tenfold between 2009 and 2010, suggesting a rapid increase in the popularity of e-cigarettes and marketing efforts. The Twitter handles tweeting most frequently about e-cigarettes were a mixture of e-cigarette brands, affiliate marketers, and resellers of e-cigarette products. Of the 471 e-cigarette tweets mentioning a specific place, most mentioned e-cigarette use in class (39.1%, 184/471) followed by home/room/bed (12.5%, 59/471), school (12.1%, 57/471), in public (8.7%, 41/471), the bathroom (5.7%, 27/471), and at work (4.5%, 21/471).

Conclusions

Twitter is being used to promote e-cigarettes by different types of entities and the online marketplace is more diverse than offline product offerings and advertising strategies. E-cigarettes are also being used in public places, such as schools, underscoring the need for education and enforcement of policies banning e-cigarette use in public places. Twitter data can provide new insights on e-cigarettes to help inform future research, regulations, surveillance, and enforcement efforts.

5.

Background

Groups and individuals that seek to negatively influence public opinion about the safety and value of vaccination are active in online and social media and may influence decision making within some communities.

Objective

We sought to measure whether exposure to negative opinions about human papillomavirus (HPV) vaccines in Twitter communities is associated with the subsequent expression of negative opinions by explicitly measuring potential information exposure over the social structure of Twitter communities.

Methods

We hypothesized that prior exposure to opinions rejecting the safety or value of HPV vaccines would be associated with an increased risk of posting similar opinions and tested this hypothesis by analyzing temporal sequences of messages posted on Twitter (tweets). The study design was a retrospective analysis of tweets related to HPV vaccines and the social connections between users. Between October 2013 and April 2014, we collected 83,551 English-language tweets that included terms related to HPV vaccines and the 957,865 social connections among 30,621 users posting or reposting the tweets. Tweets were classified as expressing negative or neutral/positive opinions using a machine learning classifier previously trained on a manually labeled sample.

Results

During the 6-month period, 25.13% (20,994/83,551) of tweets were classified as negative; among the 30,621 users that tweeted about HPV vaccines, 9046 (29.54%) were exposed to a majority of negative tweets. The likelihood of a user posting a negative tweet after exposure to a majority of negative opinions was 37.78% (2780/7361) compared to 10.92% (1234/11,296) for users who were exposed to a majority of positive and neutral tweets corresponding to a relative risk of 3.46 (95% CI 3.25-3.67, P<.001).
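
The reported relative risk can be checked from the counts given above. The sketch below assumes the common log-scale (Wald) confidence interval; the abstract does not say which interval method was used, so that choice and the function name are assumptions.

```python
import math


def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Relative risk with a log-scale (Wald) confidence interval.

    The interval method is an assumption; the source does not state it.
    """
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts from the abstract: 2780/7361 users exposed mostly to negative tweets
# versus 1234/11,296 users exposed mostly to positive or neutral tweets.
rr, lower, upper = relative_risk(2780, 7361, 1234, 11296)
print(f"RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Under this assumption the result lands close to the published 3.46 (95% CI 3.25-3.67).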

Conclusions

The heterogeneous community structure on Twitter appears to skew the information to which users are exposed in relation to HPV vaccines. We found that among users that tweeted about HPV vaccines, those who were more often exposed to negative opinions were more likely to subsequently post negative opinions. Although this research may be useful for identifying individuals and groups currently at risk of disproportionate exposure to misinformation about HPV vaccines, there is a clear need for studies capable of determining the factors that affect the formation and adoption of beliefs about public health interventions.

6.

Background

Public health agencies are actively using social media, including Twitter. In the public health and nonprofit sectors, Twitter has been limited to one-way communication. Two-way, interactive communication on Twitter has the potential to enhance organizational relationships with followers and help organizations achieve their goals by increasing communication and dialog between the organization and its followers. Research shows that nonprofit organizations use Twitter for three main functions: information sharing, community building, and action.

Objective

It is not known whether state health departments are using Twitter primarily for one-way information sharing or if they are trying to engage followers to build relationships and promote action. The purpose of this research was to discover what the primary function of Twitter use is among state health departments in the United States and whether this is similar to or different from nonprofit organizations.

Methods

A complete list of “tweets” made by each state health department account was obtained using the Twitter application programming interface. We randomly sampled 10% of each state health department’s tweets. Four research assistants hand-coded the tweets’ primary focus (organization centric or personal health information centric) and then the subcategories of information dissemination, engagement, or action. Research assistants coded each tweet for interactivity, sophistication, and redirects to another website. Data were analyzed using SPSS version 20.
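
Drawing a 10% random sample from each account's tweets, as described above, might look like the following sketch. The `(account, text)` input shape, the fixed seed, and the rule of keeping at least one tweet per account are assumptions for illustration; the abstract does not describe how small accounts or rounding were handled.

```python
import random
from collections import defaultdict


def sample_per_account(tweets, fraction=0.10, seed=0):
    """Randomly sample a fixed fraction of tweets within each account.

    tweets: iterable of (account, text) pairs.
    Returns {account: [sampled texts]}.
    """
    by_account = defaultdict(list)
    for account, text in tweets:
        by_account[account].append(text)

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = {}
    for account, texts in by_account.items():
        # Assumption: keep at least one tweet even for very small accounts.
        sample_size = max(1, round(len(texts) * fraction))
        sampled[account] = rng.sample(texts, sample_size)  # without replacement
    return sampled
```

Sampling within each account, rather than across the pooled tweets, keeps every department represented in proportion to its own activity.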

Results

There were 4221 tweets from 39 state health departments. There was no statistically significant difference in the number of tweets made by a state health department and the state population density (P=.25). The majority of tweets focused on personal health topics (69.37%, 2928/4221) while nearly one-third were tweets about the organization (29.14%, 1230/4221). The main function of organization-based tweets was engagement through conversations to build community (65.77%, 809/1236). These engagement-related tweets were primarily recognition of other organizations’ events (43.6%, 353/809) and giving thanks and recognition (21.4%, 173/809). Nearly all of the personal health information-centric tweets involved general public health information (92.10%, 1399/1519) and 79.03% (3336/4221) of tweets directed followers to another link for more information.

Conclusions

This is the first study to assess the purpose of public health tweets among state health departments. State health departments are using Twitter as a one-way communication tool, with tweets focused primarily on personal health. A state health department Twitter account may not be the primary health information source for individuals. Therefore, state health departments should reconsider their focus on personal health tweets and envision how they can use Twitter to develop relationships with community agencies and partners. In order to realize the potential of Twitter to establish relationships and develop connections, more two-way communication and interaction are essential.

7.

Background

Existing influenza surveillance in the United States is focused on the collection of data from sentinel physicians and hospitals; however, the compilation and distribution of reports are usually delayed by up to 2 weeks. With the popularity of social media growing, the Internet is a source for syndromic surveillance due to the availability of large amounts of data. In this study, tweets, or posts of 140 characters or less, from the website Twitter were collected and analyzed for their potential as surveillance for seasonal influenza.

Objective

There were three aims: (1) to improve the correlation of tweets to sentinel-provided influenza-like illness (ILI) rates by city through filtering and a machine-learning classifier, (2) to observe correlations of tweets for emergency department ILI rates by city, and (3) to explore correlations for tweets to laboratory-confirmed influenza cases in San Diego.

Methods

Tweets containing the keyword “flu” were collected within a 17-mile radius from 11 US cities selected for population and availability of ILI data. At the end of the collection period, 159,802 tweets were used for correlation analyses with sentinel-provided ILI and emergency department ILI rates as reported by the corresponding city or county health department. Two separate methods were used to observe correlations between tweets and ILI rates: filtering the tweets by type (non-retweets, retweets, tweets with a URL, tweets without a URL), and the use of a machine-learning classifier that determined whether a tweet was “valid”, or from a user who was likely ill with the flu.

Results

Correlations varied by city but general trends were observed. Non-retweets and tweets without a URL had higher and more significant (P<.05) correlations than retweets and tweets with a URL. Correlations of tweets to emergency department ILI rates were higher than the correlations observed for sentinel-provided ILI for most of the cities. The machine-learning classifier yielded the highest correlations for many of the cities when using the sentinel-provided or emergency department ILI as well as the number of laboratory-confirmed influenza cases in San Diego. High correlation values (r=.93) with significance at P<.001 were observed for laboratory-confirmed influenza cases for most categories and tweets determined to be valid by the classifier.
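
The correlations reported here compare a tweet-count series with an ILI series for the same periods. A minimal sketch of a Pearson correlation is shown below, assuming that is the statistic meant by r; the weekly numbers are invented for illustration and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical weekly series for one city (illustrative values only).
weekly_flu_tweets = [120, 180, 260, 400, 350, 210]
weekly_ili_rate = [1.1, 1.6, 2.4, 3.9, 3.2, 2.0]
print(round(pearson_r(weekly_flu_tweets, weekly_ili_rate), 3))
```

The same function could be applied separately to each filtered subset (for example, non-retweets only) to compare how filtering changes the correlation.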

Conclusions

Compared to tweet analyses in the previous influenza season, this study demonstrated increased accuracy in using Twitter as a supplementary surveillance tool for influenza: better filtering and classification methods yielded higher correlations for the 2013-2014 influenza season than those found for tweets in the previous influenza season, in which emergency department ILI rates were better correlated with tweets than sentinel-provided ILI rates were. Further investigations in the field would require expansion with regard to the location that the tweets are collected from, as well as the availability of more ILI data.

10.

Background

User content posted through Twitter has been used for biosurveillance, to characterize public perception of health-related topics, and as a means of distributing information to the general public. Most of the existing work surrounding Twitter and health care has shown Twitter to be an effective medium for these problems but more could be done to provide finer and more efficient access to all pertinent data. Given the diversity of user-generated content, small samples or summary presentations of the data arguably omit a large part of the virtual discussion taking place in the Twittersphere. Still, managing, processing, and querying large amounts of Twitter data is not a trivial task. This work describes tools and techniques capable of handling larger sets of Twitter data and demonstrates their use with the issue of antibiotics.

Objective

This work has two principal objectives: (1) to provide an open-source means to efficiently explore all collected tweets and query health-related topics on Twitter, specifically, questions such as what users are saying and how messages are spread, and (2) to characterize the larger discourse taking place on Twitter with respect to antibiotics.

Methods

Open-source software suites Hadoop, Flume, and Hive were used to collect and query a large number of Twitter posts. To classify tweets by topic, a deep network classifier was trained using a limited number of manually classified tweets. The particular machine learning approach used also allowed the use of a large number of unclassified tweets to increase performance.

Results

Query-based analysis of the collected tweets revealed that a large number of users contributed to the online discussion and that a frequent topic mentioned was resistance. A number of prominent events related to antibiotics led to a number of spikes in activity but these were short in duration. The category-based classifier developed was able to correctly classify 70% of manually labeled tweets (using a 10-fold cross validation procedure and 9 classes). The classifier also performed well when evaluated on a per category basis.
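
The 70% figure comes from a 10-fold cross-validation procedure. The sketch below shows how such an accuracy estimate is typically computed; the `train` callback and the majority-class baseline are hypothetical stand-ins, not the deep network classifier used in the study.

```python
import random
from collections import Counter


def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle item indices and deal them into k folds of near-equal size."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(texts, labels, train, k=10, seed=0):
    """Hold out each fold once, train on the rest, and pool the accuracy.

    train(train_texts, train_labels) must return a predict(text) function.
    """
    correct = 0
    for held_out in k_fold_indices(len(texts), k, seed):
        held_out_set = set(held_out)
        train_texts = [t for i, t in enumerate(texts) if i not in held_out_set]
        train_labels = [y for i, y in enumerate(labels) if i not in held_out_set]
        predict = train(train_texts, train_labels)
        correct += sum(predict(texts[i]) == labels[i] for i in held_out)
    return correct / len(texts)


def train_majority_baseline(train_texts, train_labels):
    """Placeholder model: always predicts the most common training label."""
    majority_label = Counter(train_labels).most_common(1)[0][0]
    return lambda text: majority_label
```

With 9 classes, a majority-class baseline like this one gives a useful floor against which a figure such as 70% can be judged.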

Conclusions

Using existing tools such as Hive, Flume, Hadoop, and machine learning techniques, it is possible to construct tools and workflows to collect and query large amounts of Twitter data to characterize the larger discussion taking place on Twitter with respect to a particular health-related topic. Furthermore, using newer machine learning techniques and a limited number of manually labeled tweets, an entire body of collected tweets can be classified to indicate what topics are driving the virtual, online discussion. The resulting classifier can also be used to efficiently explore collected tweets by category and search for messages of interest or exemplary content.

11.

Background

Twitter has shown some usefulness in predicting influenza cases on a weekly basis in multiple countries and on different geographic scales. Recently, Broniatowski and colleagues suggested Twitter’s relevance at the city-level for New York City. Here, we look to dive deeper into the case of New York City by analyzing daily Twitter data from temporal and spatiotemporal perspectives. Also, through manual coding of all tweets, we look to gain qualitative insights that can help direct future automated searches.

Objective

The intent of the study was first to validate the temporal predictive strength of daily Twitter data for influenza-like illness emergency department (ILI-ED) visits during the New York City 2012-2013 influenza season against other available and established datasets (Google search query, or GSQ), and second, to examine the spatial distribution and the spread of geocoded tweets as proxies for potential cases.

Methods

From the Twitter Streaming API, 2972 tweets were collected in the New York City region matching the keywords “flu”, “influenza”, “gripe”, and “high fever”. The tweets were categorized according to the scheme developed by Lamb et al. A new fourth category was added as an evaluator guess for the probability of the subject(s) being sick to account for strength of confidence in the validity of the statement. Temporal correlations were made for tweets against daily ILI-ED visits and daily GSQ volume. The best models were used for linear regression for forecasting ILI visits. A weighted, retrospective Poisson model with SaTScan software (n=1484), and vector map were used for spatiotemporal analysis.

Results

Infection-related tweets (R=.763) correlated better than GSQ time series (R=.683) for the same keywords and had a lower mean average percent error (8.4 vs 11.8) for ILI-ED visit prediction in January, the most volatile month of flu. SaTScan identified a primary outbreak cluster of high-probability infection tweets with a 2.74 relative risk ratio compared to medium-probability infection tweets at P=.001 in Northern Brooklyn, in a radius that includes Barclay’s Center and the Atlantic Avenue Terminal.
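
The prediction error quoted above is a percent-error summary of forecast versus observed visits. The sketch below assumes it corresponds to the usual mean absolute percentage error; the abstract's wording ("mean average percent error") does not confirm this, and the numbers in the example are invented.

```python
def mean_absolute_percent_error(observed, predicted):
    """Mean of |observed - predicted| / |observed|, expressed in percent.

    Assumes this is the error measure meant in the abstract.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty series of equal length")
    if any(value == 0 for value in observed):
        raise ValueError("percent error is undefined when an observed value is 0")
    ratios = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical daily ILI-ED visit counts and forecasts (illustrative only).
observed_visits = [100, 200, 150]
forecast_visits = [110, 180, 150]
print(mean_absolute_percent_error(observed_visits, forecast_visits))
```

A lower value means the forecasts sit closer to the observed counts on average, which is the sense in which 8.4 is better than 11.8.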

Conclusions

While others have looked at weekly regional tweets, this study is the first to stress test Twitter for daily city-level data for New York City. Extraction of personal testimonies of infection-related tweets suggests Twitter’s strength both qualitatively and quantitatively for ILI-ED prediction compared to alternative daily datasets mixed with awareness-based data such as GSQ. Additionally, granular Twitter data provide important spatiotemporal insights. A tweet vector-map may be useful for visualization of city-level spread when local gold standard data are otherwise unavailable.

12.

Background

Social media platforms such as Twitter are rapidly becoming key resources for public health surveillance applications, yet little is known about Twitter users’ levels of informedness and sentiment toward tobacco, especially with regard to the emerging tobacco control challenges posed by hookah and electronic cigarettes.

Objective

To develop a content and sentiment analysis of tobacco-related Twitter posts and build machine learning classifiers to detect tobacco-relevant posts and sentiment towards tobacco, with a particular focus on new and emerging products like hookah and electronic cigarettes.

Methods

We collected 7362 tobacco-related Twitter posts at 15-day intervals from December 2011 to July 2012. Each tweet was manually classified using a triaxial scheme, capturing genre, theme, and sentiment. Using the collected data, machine-learning classifiers were trained to detect tobacco-related vs irrelevant tweets as well as positive vs negative sentiment, using Naïve Bayes, k-nearest neighbors, and Support Vector Machine (SVM) algorithms. Finally, phi contingency coefficients were computed between each of the categories to discover emergent patterns.
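
Of the three algorithms named above, Naïve Bayes is the simplest to illustrate. The following is a minimal multinomial Naïve Bayes over unigram counts with add-one smoothing, written from scratch for illustration; the tokenizer, class name, and smoothing choice are assumptions and do not reproduce the study's feature set or tooling.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple unigram tokens."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


class UnigramNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(doc_count / total_docs)  # log prior
            for token in tokens:
                smoothed = self.word_counts[label][token] + 1
                score += math.log(smoothed / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

The same `fit`/`predict` shape would apply to either task described above, relevant versus irrelevant or positive versus negative, by changing only the labels.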

Results

The most prevalent genres were first- and second-hand experience and opinion, and the most frequent themes were hookah, cessation, and pleasure. Sentiment toward tobacco was overall more positive (1939/4215, 46% of tweets) than negative (1349/4215, 32%) or neutral among tweets mentioning it, even excluding the 9% of tweets categorized as marketing. Three separate metrics converged to support an emergent distinction between, on one hand, hookah and electronic cigarettes corresponding to positive sentiment, and on the other hand, traditional tobacco products and more general references corresponding to negative sentiment. These metrics included correlations between categories in the annotation scheme ($\phi_{\text{hookah-positive}}=0.39$; $\phi_{\text{e-cigs-positive}}=0.19$); correlations between search keywords and sentiment ($\chi^2_4=414.50$, $P<.001$, Cramér’s $V=0.36$); and the most discriminating unigram features for positive and negative sentiment ranked by log odds ratio in the machine learning component of the study. In the automated classification tasks, SVMs using a relatively small number of unigram features (500) achieved best performance in discriminating tobacco-related from unrelated tweets (F score=0.85).
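
For readers unfamiliar with the association measures quoted above, the sketch below shows the standard formulas for the phi coefficient of a 2x2 table and for Cramér's V derived from a chi-square statistic. The function names and argument layout are illustrative assumptions; the underlying contingency tables are not given in the abstract, so no study values are reproduced here.

```python
import math


def phi_coefficient(both, first_only, second_only, neither):
    """Phi coefficient for a 2x2 table of two binary annotations.

    both: tweets with both categories; first_only / second_only: tweets with
    exactly one of them; neither: tweets with neither.
    """
    numerator = both * neither - first_only * second_only
    denominator = math.sqrt(
        (both + first_only) * (second_only + neither)
        * (both + second_only) * (first_only + neither)
    )
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is 0")
    return numerator / denominator


def cramers_v(chi_square, n_observations, n_rows, n_cols):
    """Cramér's V computed from a chi-square statistic and table dimensions."""
    return math.sqrt(chi_square / (n_observations * min(n_rows - 1, n_cols - 1)))
```

Phi ranges from -1 to 1 and Cramér's V from 0 to 1, so values such as 0.39 and 0.36 indicate moderate association rather than a near-deterministic link.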

Conclusions

Novel insights available through Twitter for tobacco surveillance are attested through the high prevalence of positive sentiment. This positive sentiment is correlated in complex ways with social image, personal experience, and recently popular products such as hookah and electronic cigarettes. Several apparent perceptual disconnects between these products and their health effects suggest opportunities for tobacco control education. Finally, machine classification of tobacco-related posts shows a promising edge over strictly keyword-based approaches, yielding an improved signal-to-noise ratio in Twitter data and paving the way for automated tobacco surveillance applications.

13.

Background

One of the essential services provided by the US local health departments is informing and educating constituents about health. Communication with constituents about public health issues and health risks is among the standards required of local health departments for accreditation. Past research found that only 61% of local health departments met standards for informing and educating constituents, suggesting a considerable gap between current practices and best practice.

Objective

Social media platforms, such as Twitter, may aid local health departments in informing and educating their constituents by reaching large numbers of people with real-time messages at relatively low cost. Little is known about the followers of local health departments on Twitter. The aim of this study was to examine characteristics of local health department Twitter followers and the relationship between local health department characteristics and follower characteristics.

Methods

In 2013, we collected (using NodeXL) and analyzed a sample of 4779 Twitter followers from 59 randomly selected local health departments in the United States with Twitter accounts. We coded each Twitter follower for type (individual, organization), location, health focus, and industry (eg, media, government). Local health department characteristics were adopted from the 2010 National Association of City and County Health Officials Profile Study data.

Results

Local health department Twitter accounts were followed by more organizations than individual users. Organizations tended to be health-focused, located outside the state from the local health department being followed, and from the education, government, and non-profit sectors. Individuals were likely to be local and not health-focused. Having a public information officer on staff, serving a larger population, and “tweeting” more frequently were associated with having a higher percentage of local followers.

Conclusions

Social media has the potential to reach a wide and diverse audience. Understanding audience characteristics can help public health organizations use this new tool more effectively by tailoring tweet content and dissemination strategies for their audience.

14.

Background

Twitter is a popular social media forum for sharing personal experiences, interests, and opinions. An improved understanding of the discourse on Twitter that encourages marijuana use can be helpful for tailoring and targeting online and offline prevention messages.

Objectives

The intent of the study was to assess the content of “tweets” and the demographics of followers of a popular pro-marijuana Twitter handle (@stillblazingtho).

Methods

We assessed the sentiment and content of tweets (sent from May 1 to December 31, 2013), as well as the demographics of consumers that follow a popular pro-marijuana Twitter handle (approximately 1,000,000 followers) using Twitter analytics from Demographics Pro. This analytics company estimates demographic characteristics based on Twitter behavior/usage, relying on multiple data signals from networks, consumption, and language and requires confidence of 95% or above to make an estimate of a single demographic characteristic.

Results

A total of 2590 tweets were sent from @stillblazingtho during the 8-month period; 305 (11.78%) replies to other Twitter users were excluded from the qualitative analysis. Of the remaining 2285 tweets, 1875 (82.06%) were positive about marijuana, 403 (17.64%) were neutral, and 7 (0.31%) appeared negative about marijuana. Of the positive marijuana tweets, 1101 (58.72%) were perceived as jokes or humorous, 340 (18.13%) implied that marijuana helps you to feel good or relax, 294 (15.68%) mentioned routine, frequent, or heavy use, 193 (10.29%) mentioned blunts, marijuana edibles, or paraphernalia (eg, bongs, vaporizers), and 186 (9.92%) mentioned other risky health behaviors (eg, tobacco, alcohol, other drugs, sex). The majority (699,103/959,143; 72.89%) of @stillblazingtho followers were 19 years old or younger. Among people ages 17 to 19 years, @stillblazingtho was in the top 10% of all Twitter handles followed. More followers of @stillblazingtho in the United States were African American (323,107/759,407; 42.55%) or Hispanic (90,732/759,407; 11.95%) than the Twitter median (African American 22.4%, interquartile range [IQR] 5.1-62.5%; Hispanic 5.4%, IQR 3.0-10.8%), and among Hispanics, @stillblazingtho was in the top 30% of all Twitter handles followed.

Conclusions

Young people are especially responsive to social media influences and often establish substance use patterns during this phase of development. Our findings underscore the need for surveillance efforts to monitor the pro-marijuana content reaching young people on Twitter.

16.

Background

Biomedical research has traditionally been conducted via surveys and the analysis of medical records. However, these resources are limited in their content, such that non-traditional domains (eg, online forums and social media) have an opportunity to supplement the view of an individual’s health.

Objective

The objective of this study was to develop a scalable framework to detect personal health status mentions on Twitter and assess the extent to which such information is disclosed.

Methods

We collected more than 250 million tweets via the Twitter streaming API over a 2-month period in 2014. The corpus was filtered down to approximately 250,000 tweets, stratified across 34 high-impact health issues, based on guidance from the Medical Expenditure Panel Survey. We created a labeled corpus of several thousand tweets via a survey, administered over Amazon Mechanical Turk, that documents when terms correspond to mentions of personal health issues or an alternative (eg, a metaphor). We engineered a scalable classifier for personal health mentions via feature selection and assessed its potential over the health issues. We further investigated the utility of the tweets by determining the extent to which Twitter users disclose personal health status.

Results

Our investigation yielded several notable findings. First, we found that tweets from a small subset of the health issues can train a scalable classifier to detect health mentions. Specifically, training on 2000 tweets from four health issues (cancer, depression, hypertension, and leukemia) yielded a classifier with precision of 0.77 on all 34 health issues. Second, Twitter users disclosed personal health status for all health issues. Notably, personal health status was disclosed over 50% of the time for 11 out of 34 (32%) investigated health issues. Third, the disclosure rate was dependent on the health issue in a statistically significant manner (P<.001). For instance, more than 80% of the tweets about migraines (83/100) and allergies (85/100) communicated personal health status, while only around 10% of the tweets about obesity (13/100) and heart attack (12/100) did so. Fourth, the likelihood that people disclose their own versus other people’s health status was dependent on health issue in a statistically significant manner as well (P<.001). For example, 69% (69/100) of the insomnia tweets disclosed the author’s status, while only 1% (1/100) disclosed another person’s status. By contrast, 1% (1/100) of the Down syndrome tweets disclosed the author’s status, while 21% (21/100) disclosed another person’s status.
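
To make the third finding concrete, the sketch below computes a Pearson chi-square statistic for the four example health issues quoted above (disclosing tweets out of 100 sampled per issue). It is an illustration only: the abstract does not name the statistical test the authors used, and their P value refers to all 34 health issues rather than these four. The function name `chi_square_statistic` is hypothetical.

```python
# Illustrative only: counts are the four examples quoted in the abstract
# (tweets disclosing personal health status, out of 100 sampled per issue).
# The choice of a chi-square test is an assumption, not the authors' stated method.
disclosed = {"migraine": 83, "allergies": 85, "obesity": 13, "heart attack": 12}
SAMPLE_SIZE = 100


def chi_square_statistic(counts, n_per_group):
    """Pearson chi-square statistic for a k x 2 table (disclosed vs. not disclosed)."""
    total = n_per_group * len(counts)
    total_yes = sum(counts.values())
    # Expected cell counts if the disclosure rate were identical across issues.
    expected_yes = n_per_group * total_yes / total
    expected_no = n_per_group - expected_yes
    stat = 0.0
    for yes in counts.values():
        no = n_per_group - yes
        stat += (yes - expected_yes) ** 2 / expected_yes
        stat += (no - expected_no) ** 2 / expected_no
    return stat


chi2 = chi_square_statistic(disclosed, SAMPLE_SIZE)
dof = len(disclosed) - 1  # (rows - 1) * (columns - 1) with 2 columns
print(f"chi-square = {chi2:.1f} on {dof} degrees of freedom")
```

For 3 degrees of freedom, the standard critical value at the .001 level is about 16.27, so a statistic far above that is consistent with the reported dependence of disclosure rate on health issue.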

Conclusions

It is possible to automatically detect personal health status mentions on Twitter in a scalable manner. These mentions correspond to the health issues of the Twitter users themselves, but also other individuals. Though this study did not investigate the veracity of such statements, we anticipate such information may be useful in supplementing traditional health-related sources for research purposes.

17.
18.
19.

Background

Surveillance plays a vital role in disease detection, but traditional methods of collecting patient data, reporting to health officials, and compiling reports are costly and time consuming. In recent years, syndromic surveillance tools have expanded and researchers are able to exploit the vast amount of data available in real time on the Internet at minimal cost. Many data sources for infoveillance exist, but this study focuses on status updates (tweets) from the Twitter microblogging website.

Objective

The aim of this study was to explore the interaction between cyberspace message activity, measured by keyword-specific tweets, and real world occurrences of influenza and pertussis. Tweets were aggregated by week and compared to weekly influenza-like illness (ILI) and weekly pertussis incidence. The potential effect of tweet type was analyzed by categorizing tweets into 4 categories: nonretweets, retweets, tweets with a URL Web address, and tweets without a URL Web address.

Methods

Tweets were collected within a 17-mile radius of 11 US cities chosen on the basis of population size and the availability of disease data. Influenza analysis involved all 11 cities. Pertussis analysis was based on the 2 cities nearest to the Washington State pertussis outbreak (Seattle, WA and Portland, OR). Tweet collection resulted in 161,821 flu, 6174 influenza, 160 pertussis, and 1167 whooping cough tweets. The correlation coefficients between tweets or subgroups of tweets and disease occurrence were calculated and trends were presented graphically.
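
The weekly aggregation and correlation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the tweet dates and ILI rates are made-up values, the helper names `weekly_counts` and `pearson_r` are hypothetical, and the abstract does not say which correlation coefficient was used (Pearson is assumed here).

```python
from collections import Counter
from datetime import date
from math import sqrt


def weekly_counts(tweet_dates):
    """Count tweets per ISO (year, week) pair."""
    return Counter(tuple(d.isocalendar())[:2] for d in tweet_dates)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: dates of keyword-matched tweets and weekly ILI rates (%).
tweet_dates = [
    date(2012, 1, 2), date(2012, 1, 3),                     # ISO week 1
    date(2012, 1, 10), date(2012, 1, 11), date(2012, 1, 12),  # ISO week 2
    date(2012, 1, 17),                                      # ISO week 3
]
ili_rate = {(2012, 1): 1.4, (2012, 2): 2.0, (2012, 3): 1.1}

counts = weekly_counts(tweet_dates)
weeks = sorted(ili_rate)
r = pearson_r([counts[w] for w in weeks], [ili_rate[w] for w in weeks])
print(f"r = {r:.2f}")
```

Running the same calculation separately on subgroups (for example, nonretweets only) would give the per-subgroup coefficients the study compares.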

Results

Correlations between weekly aggregated tweets and disease occurrence varied greatly, but were relatively strong in some areas. In general, correlation coefficients were stronger in the flu analysis than in the pertussis analysis. Within each analysis, flu tweets were more strongly correlated with ILI rates than influenza tweets, and whooping cough tweets correlated more strongly with pertussis incidence than pertussis tweets. Nonretweets correlated more strongly with disease occurrence than retweets, and tweets without a URL Web address correlated better with actual incidence than those with a URL Web address, primarily for the flu tweets.

Conclusions

This study demonstrates that not only does keyword choice play an important role in how well tweets correlate with disease occurrence, but that the subgroup of tweets used for analysis is also important. This exploratory work shows potential in the use of tweets for infoveillance, but continued efforts are needed to further refine research methods in this field.

20.

Background

Twitter provides various types of location data, including exact Global Positioning System (GPS) coordinates, which could be used for infoveillance and infodemiology (ie, the study and monitoring of online health information), health communication, and interventions. Despite its potential, Twitter location information is not well understood or well documented, limiting its public health utility.

Objective

The objective of this study was to document and describe the various types of location information available in Twitter. The different types of location data that can be ascertained from Twitter users are described. This information is key to informing future research on the availability, usability, and limitations of such location data.

Methods

Location data were gathered directly from Twitter using its application programming interface (API). The maximum number of tweets allowed by Twitter (1% of the total tweets) was gathered over 2 separate weeks in October and November 2011. The final dataset consisted of 23.8 million tweets from 9.5 million unique users. Frequencies for each of the location options were calculated to determine the prevalence of the various location data options by region of the world, time zone, and state within the United States. Data from the US Census Bureau were also compiled to determine population proportions in each state, and Pearson correlation coefficients were used to compare each state’s population with the number of Twitter users who enable the GPS location option.

Results

The GPS location data could be ascertained for 2.02% of tweets and 2.70% of unique users. Using a simple text-matching approach, 17.13% of user profiles in the 4 continental US time zones could be used to determine the user’s city and state. Agreement between GPS data and data from the text-matching approach was high (87.69%). Furthermore, there was a significant correlation between the number of Twitter users per state and the 2010 US Census state populations (r ≥ 0.97, P < .001).
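
The abstract does not describe how the text matching was done. As one plausible reading, the sketch below matches free-text profile locations written as "City, ST" against the list of US state abbreviations. The pattern, the function name `match_city_state`, and the sample profile strings are all assumptions made for illustration.

```python
import re

# Two-letter postal abbreviations for the 50 US states plus DC.
US_STATES = {
    "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA", "HI", "ID",
    "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD", "MA", "MI", "MN", "MS",
    "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "OH", "OK",
    "OR", "PA", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV",
    "WI", "WY", "DC",
}

# Assumed profile format: "<city>, <two-letter state>", ignoring case and spacing.
CITY_STATE = re.compile(r"^\s*([A-Za-z .'-]+?)\s*,\s*([A-Za-z]{2})\s*$")


def match_city_state(profile_location):
    """Return (city, state) if the profile text looks like 'City, ST', else None."""
    m = CITY_STATE.match(profile_location)
    if m is None:
        return None
    city, state = m.group(1).title(), m.group(2).upper()
    return (city, state) if state in US_STATES else None


# Hypothetical profile strings, not taken from the study's data.
profiles = ["Seattle, WA", "portland, or", "somewhere over the rainbow", "Paris, FR"]
matches = [match_city_state(p) for p in profiles]
share = sum(m is not None for m in matches) / len(profiles)
print(matches, f"{share:.0%} matched")
```

A rule this strict trades recall for precision: it ignores profiles that name only a city or spell out the state, which is one reason a text-matching yield can stay well below 100%.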

Conclusions

Health researchers exploring ways to use Twitter data for disease surveillance should be aware that the majority of tweets are not currently associated with an identifiable geographic location. Location can be identified for approximately 4 times the number of tweets using a straightforward text-matching process compared to using the GPS location information available in Twitter. Given the strong correlation between the two data-gathering methods, future research may consider using more qualitative approaches with higher yields, such as text mining, to acquire information about Twitter users’ geographical location.
