Sales and Spending in the Health Care Industry



The recent release of data on payments made by drug and medical device companies to physicians and teaching hospitals provides a unique opportunity to develop insights into trends in the health care industry. For example, which classes of drugs are heavily promoted? In which states are companies spending the most to promote their products? How does industry spending on health care providers relate to sales within a region? These are just some of the questions that I believe can be answered from these disclosures.

Data in this analysis

The analysis outlined here uses the General Payments data for payments made to physicians and teaching hospitals, provided through the Open Payments website hosted by CMS and covering the period January 2014 to December 2014. The analysis also includes data on total retail sales of prescription drugs in 2014 provided by the Henry J. Kaiser Family Foundation, and tables on state population and physician numbers per state for 2014 published by the Federation of State Medical Boards. The following is a summary of the tables.

General Payments Data:

  • 10.78 million total number of records
  • 2.52 billion USD in total value of disclosures
  • 607,000 physicians receiving payments
  • 1,442 companies that made payments
  • 1,122 teaching hospitals receiving payments
  • 5.3 GB file size
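
Headline figures like these could be derived with a few pandas aggregations. The sketch below is illustrative only: it runs on a tiny made-up frame, and the column names are assumptions modelled on the Open Payments CSV headers rather than names confirmed by this analysis.

```python
import pandas as pd

# Toy stand-in for the General Payments table; all values are made up.
# Column names are assumptions modelled on the Open Payments CSV headers.
COMPANY_COL = "Applicable_Manufacturer_or_Applicable_GPO_Making_Payment_Name"
payments = pd.DataFrame({
    "Physician_Profile_ID": [101, 101, 102, None],
    "Teaching_Hospital_ID": [None, None, None, 9001],
    COMPANY_COL: ["Acme Pharma", "Beta Devices", "Acme Pharma", "Acme Pharma"],
    "Total_Amount_of_Payment_USDollars": [25.0, 100.0, 12.5, 5000.0],
})

def summarize_payments(df):
    """Return headline counts and the total value for a payments table."""
    return {
        "records": len(df),
        "total_usd": float(df["Total_Amount_of_Payment_USDollars"].sum()),
        # nunique() ignores missing IDs, so rows without a physician
        # (or without a teaching hospital) are not counted.
        "physicians": int(df["Physician_Profile_ID"].nunique()),
        "companies": int(df[COMPANY_COL].nunique()),
        "teaching_hospitals": int(df["Teaching_Hospital_ID"].nunique()),
    }

summary = summarize_payments(payments)
print(summary)
```

On the full 5.3 GB file, the same aggregations would more likely be run in the database or over chunks than on a single in-memory frame.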

Drug Sales Data:

  • 259 billion USD in total value of sales
  • 51 total number of records

Physician Data on Demographics:

  • 318 million total US population
  • 916,264 licensed physicians
  • 51 states covered

The tables are contained in a MySQL database.
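
As a rough sketch of how a table held in a SQL database might be pulled into pandas, the example below uses an in-memory SQLite database as a stand-in so that it runs without a MySQL server. The table name `drug_sales`, its columns, and the helper `load_table` are hypothetical and not taken from this analysis.

```python
import sqlite3
import pandas as pd

# In-memory SQLite stands in for the MySQL database so the sketch runs
# without a server. Table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drug_sales (state TEXT, sales_usd REAL)")
conn.executemany(
    "INSERT INTO drug_sales VALUES (?, ?)",
    [("AA", 100.0), ("BB", 250.0)],  # made-up rows
)
conn.commit()

def load_table(connection, table_name):
    """Read a whole table into a DataFrame."""
    # table_name is interpolated directly, so only pass trusted names.
    query = "SELECT * FROM {}".format(table_name)
    return pd.read_sql_query(query, connection)

drug_sales = load_table(conn, "drug_sales")
print(drug_sales)
conn.close()
```

With MySQL, the same `pd.read_sql_query` call would take a SQLAlchemy engine built with `create_engine` in place of the SQLite connection.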


In [21]:
import MySQLdb
import pandas as pd
import csv
import numpy as np
import scipy as sp
from sqlalchemy import create_engine
from IPython.display import display
import datetime as dt
import matplotlib.pyplot as plt
pd.set_option('display.max_columns', 50)  # full option key; the short form can be ambiguous in newer pandas
%matplotlib inline

import plotly.plotly as py  # plotly < 4 only; in plotly >= 4 this module lives in the chart_studio package
from plotly.graph_objs import Bar, Scatter, Marker, Layout
from plotly import __version__
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot

import colorlover as cl #colorscale
from IPython.display import HTML
In [22]:
init_notebook_mode() #use plotly offline