Mathew Kiang (.com) | Yes, with one "t".

Collaboration network: 2025 edition

January 9, 2026

Looking back on what I think can be fairly described as a chaotic year (professionally), three events really encapsulated what it was like being in academia.

In January (2025), I was on an NIH study section that got cancelled as the new administration shook up HHS/CDC/NIH and did some mass grant cancellations. It eventually got rescheduled but I wasn’t able to attend — I can only assume it was as hectic as one would imagine.
Then in April, I was excited to start working on my second NASEM consensus committee. It got abruptly cancelled by the CDC the morning of our first meeting. This was, notably, after the whole committee was flown in from across the country the night before, so it couldn’t possibly have been a cost-saving measure. We were never given a reason for the cancellation but I can only assume genetic blood disorders are woke.
Then in October, I was on another NIH study section that got cancelled — this time due to the government shutdown. It’s scheduled for later this month, but as a result of the condensed timeline, we’re reviewing ~40% fewer grants than we would normally discuss in study section.

A walk in the park

December 1, 2025

San Francisco has a lot of parks. Depending on who you ask, there are 230, 280, or 510 different parks, playgrounds, and green spaces spread across SF’s 47 square miles of land. In 2017, San Francisco became the first US city where every resident lived within a 10-minute walk of a park. Since then, many other cities have joined the ranks and SF isn’t even in the top 5 US cities anymore, with DC, Irvine, Minneapolis, Cincinnati, and St. Paul taking up those spots.

Still, having every resident within a 10-minute walk of a park is quite the feat, especially since San Francisco has a larger population (about 820,000 people), a smaller land area, and more complex topography (~1,000 feet of elevation change) than the other places.

As somebody who frequents the local parks quite often, I’ve noticed that one thing the nearest-walking-distance metric doesn’t capture is the diversity of parks in San Francisco. For example, within 10 minutes of my apartment, there are parks where you can listen to stadium concerts for free, enjoy a manmade beach that looks across the bay, go to one of the largest kids’ playgrounds in the city, picnic to live entertainment during a Warriors game, or grab a beer and lunch from some food trucks.

This did make me a bit curious though — is my neighborhood unique?

Debriefing our JAMA paper on vaccination in the US

May 14, 2025

About three weeks ago, our paper modeling the reemergence of vaccine-eliminated diseases in the US was published. We used microsimulation to evaluate different scenarios of vaccine decline (and increase) in the US for four vaccine-eliminated pathogens: measles, rubella, diphtheria, and polio. The simulation incorporates state-age-specific mortality, state-specific birth rates, state-age-specific vaccination rates, and known disease parameters. Needless to say, it was a timely paper, and we hope it will be useful as the administration considers changes to the childhood vaccination schedule and vaccine approvals/procedures.

Obviously, I think the paper itself is worth reading but there were a lot of higher level / meta aspects of the paper I thought were notable.

Catching up + Collaboration network update: 2010-2024

March 18, 2025

You wouldn’t know it from the lack of updates on this blog, but it’s been a busy couple years. I’m sure I’m missing many important things, but a few highlights include:

Ben’s JAMA Peds paper got a lot of media attention, which prompted action from Senators Duckworth and Durbin.
A student paper on sub-national excess overdose estimates got published in AJPH after four years — just proving how resilient and persistent the new crop of PhD students are.
Keith and I wrote a thing about how “higher” or “lower” mortality really depends on your comparison group.
Our JAMA paper estimating parental deaths due to drug poisoning and firearms was cited in the US Surgeon General’s Advisory on Firearm Violence, which was deleted (today) by the current administration. Thankfully, I’m vain and saved a copy of the PDF, which you can find here.
Monica and I wrote a thing about how we need better data to understand drug use — especially timely as public health data get removed, discontinued, or paused.
A paper with Holly (that went through at least 8 rejections) was finally accepted, wrapping up a 3-year-long project.
After a couple dozen meetings, a dozen or so flights, and 2.5 years of work, the NASEM Consensus Committee report I was working on is finally out.

I was on a National Academies consensus committee and our report is finally out!

February 10, 2025

Analyzing San Francisco’s parking citation data

January 6, 2025

San Francisco issues about 5,000 parking citations per day, resulting in over 21.4 million parking citations to over 5.2 million unique license plates since January 2008. That’s a pretty astounding number of parking citations when you consider San Francisco is home to fewer than a million people and fewer than half a million registered cars, and covers an area that’s just 7 miles by 7 miles. Parking citations are handled by San Francisco MTA while moving violations are handled by the San Francisco Police Department, which has a well-documented habit of not enforcing (even dangerous) vehicle violations, resulting in about a 90% drop in citations since 2019.

Last week, I nearly got killed by a person running a (blatantly) red light. Ironically, it happened in front of where I live — literally one block from a police station. Thankfully, the person had a fairly memorable license plate so I went home and found the person had over two dozen parking citations in 2024 alone.

I decided to take a closer look at the data (helpfully provided by DataSF) and ended up going down a little rabbit hole. Come join me.

Fun fact: In the first five Fast and Furious movies, they only mention “family” or “families” 6 times. They are mentioned more than that in each of the next five movies for a total of 71 times.

June 22, 2023

Our JAMA IM paper on excess mortality among physicians in the US was just published

February 27, 2023

I’m on a National Academies’ committee about opioid and benzo prescribing in the VA — public comments welcome for the upcoming (open) sessions

February 1, 2023

My collaboration network: 2010 to 2022 version

January 20, 2023

There was a lot going on last year and I missed my annual tradition of updating my collaboration network at the end of the year. However, thanks to a perpetual illness ravaging the house, I’ve found some sleepless hours to fix my old code and pick up the tradition again.

As is always the case when I do this exercise, I can’t help but reflect back on 2022 (and 2021) with gratitude and appreciation for my amazing collaborators. It was my first year on the tenure track and hours that normally would have been spent pushing research forward were instead spent getting my feet under me, taking on new students, hiring new postdocs, building up some community partnerships, shuffling through administrative tasks, etc. There are a lot of things I enjoy about the job but by far the most enjoyable thing is working on things I think are interesting and important with people I like.

Plots of my biking in 2022

January 14, 2023

One of the best things about living in California is having amazing weather nearly all year long (the current 3-week-stretch-of-non-stop-rain aside). So last year, I decided to capitalize on the weather and made a New Year’s resolution to bike outdoors more. Specifically, I wanted to bike 1,500 miles outdoors in addition to my normal indoor biking of 2,500 miles. (Also, with a side quest of 100,000 feet of cumulative elevation gain.)

Below is a plot of my cumulative distance (and elevation) over the course of the year. I just barely got the distance resolutions with 1,551.7 miles outdoors and 2,506.1 indoors. I missed the elevation resolution by about 16,000 feet (ending at 83,899 feet).

It finally happened — I got COVID

December 7, 2022

Last September, I got COVID. It was wildly unpleasant with serious brain fog that lasted for several weeks even after the other symptoms went away. That said, this did give me the opportunity to make some more plots based on my own data. Below, I show a few metrics of my vital signs (respiratory rate, heart rate, heart rate variability, and body temperature deviation) relative to my exposure (vertical dotted line) for six weeks before and after. The thicker grey lines in the background are the pre- and post-exposure averages for those six weeks.

My slides from a guest lecture on data visualization

August 2, 2022

My slides from PAA 2022 on excess fatal drug poisonings in California

April 28, 2022

It’s official — I’m a tenure-track assistant professor

January 24, 2022

Our new paper suggesting smartphones are a good way to collect passive data from diverse groups with low levels of missingness

July 29, 2021

Our new systematic review of depression, anxiety, and suicidal thoughts among PhD students

July 23, 2021

My slides from a panel on wildfires, power outages, and vulnerability in California

June 2, 2021

Our preprint on why the situation in Yemen, currently the largest humanitarian crisis in the world, needs immediate attention before it becomes even worse

April 11, 2021

Our pre-print on making vaccination schedules more racially and ethnically equitable (feedback welcome)

April 2, 2021

Our paper looking at different COVID-19 testing/quarantine strategies for air travel

March 22, 2021

Our new The Lancet Infectious Diseases paper is out. We used microsimulation to evaluate different testing and quarantine strategies for air travel. The simulation incorporates day-specific test sensitivity, asymptomatic infections, and differential levels of adherence to self-quarantine. Given all the travel for holidays and spring break, we’re hoping it provides some useful insight for airlines and public health departments.

New Nature Comms paper on combining CDR data with (biased) dengue sequences to measure the relative impacts of immunity, environment, and human mobility

March 22, 2021

Our open-access paper looking at state cannabis laws and rates of assault and self-harm is up in JAMA Network Open with a great invited commentary

March 18, 2021

Our paper on how mobility network properties affect infectious disease dynamics in megacities in Epidemics

February 26, 2021

Our new paper in Epidemiology looking at trends and sociodemographic disparities in electricity-dependent durable medical equipment rentals

February 15, 2021

About 1 in 6 Medicare beneficiaries left behind opioid pills after death — it should be easy and simple for their families to safely dispose of old medications

January 29, 2021

Comparing epidemics

January 16, 2021

In all likelihood, the US will end up with more (direct) deaths from COVID-19 than the “opioid epidemic” since 1999.

Using the CDC WONDER data for opioid deaths and the NYTimes data for COVID-19, I show the cumulative deaths (y-axis) from all opioids (blue) and from COVID (red) over time (x-axis).

UPDATE: As of March 2, 2021, the US has had more confirmed COVID-19 deaths than all opioid-related deaths from 1999 to 2019. My back-of-the-envelope estimate suggests we will easily pass the 2020 opioid-related deaths by the end of April.

Our paper in Scientific Reports about incorporating mobility data and using ensemble models to predict dengue in Thailand

January 13, 2021

Student’s Tay Distribution

December 27, 2020

Taylor Swift has recorded 9 albums, each of which (except the most recent) has gone multi-platinum. In total, she has sold over 200 million records and won 10 Grammys, an Emmy, 32 AMAs, and 23 Billboard Music Awards. Not bad for somebody who just turned 31.

This year, she’s managed to release two albums — they’re both very good. However, I noticed there seemed to be more profanity than I had remembered on her older albums. Here, I’ll use tidytext to see if she has actually increased her rate of profanity or if I’m simply misremembering things.

My collaboration network for 2010 to 2020 (+ other plots)

December 10, 2020

In what has become a bit of an annual tradition, here is my collaboration network for 2010 to 2020. This year was rough. Of the two first-author papers published this year, one was pre-pandemic. I think it’s fair to say this wasn’t the level of productivity I was expecting of myself. Hopefully, a few projects still in the pipeline will come out early next year.

All that said, I’m thankful for a strong network of kind collaborators who picked up my slack when necessary, checked in on me even when we didn’t have an active project, and understood when childcare issues caused last minute Zoom cancellations.

You’ll have plenty of time to work with famous, smart, and/or fun people — 2020 was a good reminder of the importance of working with kind people.

Comparing daily (direct) COVID-19 deaths to other causes of death

November 26, 2020

It’s easy to get numb at this stage of the pandemic, but a friendly reminder that daily COVID-19 (direct) deaths have been consistently higher than 8 of the top 10 causes of death (in 2018) since April.

We’re on track for over 3,000 deaths per day by Christmas (!!) — things are not good.

Power outages can have serious health consequences. Our new paper reviews the literature

November 12, 2020

My slides from a panel talk about some exciting work we have coming up about California wildfires

October 23, 2020

Applying an intro-level networks concept to deleting tweets

October 16, 2020

There are a few services out there that will delete your old tweets for you, but I wanted to delete tweets with a bit more control. For example, there are some tweets I need to keep up for whatever reason (e.g., I need it for verification) or a few jokes I’m proud of and don’t want to delete.

If you just want the R code to delete some tweets based on age and likes, here it is (noting that it is based on Chris Albon’s Python script). In this post, I go over a bit of code about what I thought was an interesting problem: given a list of tweets, how can we identify and group threads?
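One intro-level networks idea that fits the thread problem is finding connected components in a reply graph: treat each tweet as a node, draw an edge from each reply to the tweet it replies to, and call every connected component a thread. Below is a minimal sketch in Python rather than the post's R, and it is not the linked script. The function name `group_threads` and the dict keys `id` and `reply_to` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def group_threads(tweets):
    """Group tweets into threads via connected components of the reply graph.

    `tweets` is a list of dicts with hypothetical keys:
      - "id": the tweet's identifier
      - "reply_to": the id of the tweet it replies to, or None
    Replies to tweets that are not in the list are treated as thread starts.
    """
    ids = {t["id"] for t in tweets}

    # Build an undirected adjacency list: an edge links a reply to its parent.
    neighbors = defaultdict(set)
    for t in tweets:
        parent = t["reply_to"]
        if parent is not None and parent in ids:
            neighbors[t["id"]].add(parent)
            neighbors[parent].add(t["id"])

    # Visit each component once with an iterative depth-first search.
    seen = set()
    threads = []
    for t in tweets:
        start = t["id"]
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        threads.append(sorted(component))
    return threads
```

With threads grouped this way, a deletion rule could be applied per thread instead of per tweet, for example keeping a whole thread if any tweet in it should be kept.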

Our paper in Annals of Internal Medicine about counting COVID-19 deaths (with link to an informative accompanying editorial)

September 11, 2020

I wrote a simulation paper about playing Candy Land with toddlers

September 7, 2020

Our new paper on intersecting county characteristics (at different levels) that will likely impact an equitable COVID-19 response (and my first paper as the supervising author!)

September 2, 2020

Our paper about using phone-based location data for COVID-19 research is now up at The Lancet Digital Health

September 1, 2020

Things to consider before applying for a K99/R00

June 12, 2020

It officially looks like I’ll be awarded a K99/R00 (!!). The application process was a long, overwhelming slog — only possible with the generous support of mentors, colleagues, friends, and strangers.

Here, I will try to pay it forward by sharing some thoughts and advice. There are plenty of good blog posts about applying for K99’s, so I’ll try to avoid repeating those. Instead, I’m going to focus on things I didn’t know before and/or didn’t read elsewhere. It will be based on (1) insight from others who applied, (2) advice from mentors of successfully funded applicants, and (3) my interpretation from reading about 20 K-award summary statements and applications (both funded and unfunded).

If there’s enough interest in the topic, I might write about the writing process itself, but here I’m going to focus on things to do before you apply. The tl;dr is (1) consider non-K99 options, (2) apply early in your postdoc, (3) give yourself more time than you think you’ll need, (4) be strategic about your target institution, and (5) avoid easy critiques.

I scraped data of all San Francisco public elementary schools for parents figuring out the school lottery system

March 1, 2020

Our new paper about opioid prescribing patterns in the US

February 4, 2020

Some notes about a new (open access) paper with Keith Humphreys, Mark Cullen, and Sanjay Basu — “Opioid prescribing patterns among medical providers in the United States, 2003-17: retrospective, observational study” — just published in BMJ.

Collaboration network from 2010 to 2019

December 7, 2019

I have been trying to wrap my head around working with temporal networks — not just simple edge activation that changes over time but also evolving node attributes and nodes that may appear and disappear at random. What better way than to work with a small concrete example I’m already very familiar with?

Our overview of how stigma has affected the medical response to the overdose crisis (with a lot of amazing authors) in PLOS Medicine

November 29, 2019

Our paper on using Bayesian joint spatial models to decompose racial/ethnic inequalities (equations got ruined during proofing — corrections coming)

November 21, 2019

My SER 2019 slides on decomposing the Black-White life expectancy inequality due to deaths of despair

June 21, 2019

Our open-access paper looking at geographic variation in opioid mortality, by opioid type in JAMA Network Open. (Code and Shiny app are in the link.)

February 25, 2019

Quick look at NIH K-award funding

December 12, 2018

Motivated by a chat with Maria Glymour, I took a quick look at NIH K-award funding rates. It’s a very exploratory/descriptive look, but all the code is up on my GitHub. I’m hoping to find time to dive into the data more at some point.

Just putting it here, with no commentary, in case others who are applying for K’s might find it useful.

UPDATE: Since this post, I applied for, and received, a K99. Check out that blog post for more up-to-date numbers and a new Shiny app.

Our paper on trends in pregnancy-associated mortality involving opioids in the United States, 2007–2016

December 6, 2018

Our (open-access) paper on trends in black and white opioid-related mortality is up now. (Code and data linked in the paper.)

August 5, 2018

My Collaboration Network

June 17, 2018

My Twitter timeline is blowing up with #NetSci2018 tweets and awesome visualizations this week, so I was inspired to see if I can quickly make my own “gratuitous collaboration graph” (as Dan would say).

Hover over each node to see the name of the paper (red), co-author (blue), or other project (green for data and orange for software).

Our NEJM paper “Mortality in Puerto Rico after Hurricane Maria”

June 15, 2018

Here is a non-exhaustive list of materials related to our new paper in New England Journal of Medicine, “Mortality in Puerto Rico after Hurricane Maria.” While I normally just link to my papers, this particular paper garnered a significant amount of attention and was viewed over 100,000 times. Not all the attention was good, accurate, or fair — but that is for another post.

Looking at opioid-related mortality, by race, 1979 to 2016

June 6, 2018

Our paper (with Monica Alexander and Magali Barbieri) is out now ~~(in published ahead-of-print form)~~. Monica has a great, short Twitter thread on the findings so if you don’t want to read the whole thing, check that out. The publisher’s PDF is here.

After submitting our paper, the NCHS released the 2016 multiple cause of death files — so a couple months ago we were curious to see how (or if) our results would change when adding the 2016 data.

[NOTE (2/25/2019): Since this post, the 2017 data have also been released. I did a similar analysis comparing our original paper with the additional two years of data. See this post here or our GitHub repo.]

I presented some joint work at PAA last week. Slides, code, and more here

May 4, 2018

tldr; San Diego weather is better than Boston weather

December 2, 2017

I am taking a break from a crazy couple months of writing and coding by… writing code. Just a quick post comparing weather in Boston (where I am) to weather in San Diego (where I’m from).

While the New York Times may have made the original, most data viz people will recognize the plot above from Tufte’s classic, The Visual Display of Quantitative Information. It presents a ton of data in a clear, concise, and appealing way. The background bars show the record high and low daily temperature, the mid-ground bars show the “normal” (though as far as I can tell, normal is never clearly defined) high and low temperature, and the foreground shows the high and low for that year. In addition, we have annotations for days that met or made the record. The original plot even had a subplot for daily precipitation.

Here is a similar plot for Boston:

Replacing decimal points with interpuncts in MS Word

November 21, 2017

It turns out Microsoft Word’s “Advanced Find and Replace” is quite… well, advanced. You can actually use regex to do relatively complex find and replace functions. For example, The Lancet requires that all decimal points be middle dots (i.e., interpuncts). This is pretty trivial in LaTeX or Rmd and turns out it’s equally easy in Word.

Just use ([0-9]{1})(\.)([0-9]{1}) as your search query and \1·\3 as your replacement with the “Use wildcards” option.
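For anyone who wants to try the same substitution outside Word, here is a rough Python equivalent. It is a sketch under assumptions: Word's wildcard syntax is not identical to Python's `re` syntax, and the function name `to_interpuncts` is made up for illustration. The pattern keeps the same three-group shape (digit, period, digit) and the replacement reuses groups 1 and 3 around a middle dot.

```python
import re

# Same three-group shape as the Word query: digit, literal period, digit.
DECIMAL_POINT = re.compile(r"([0-9]{1})(\.)([0-9]{1})")


def to_interpuncts(text):
    """Replace a period that sits between two digits with a middle dot (U+00B7)."""
    # Groups 1 and 3 are the digits on either side; group 2 (the period) is dropped.
    return DECIMAL_POINT.sub("\\1\u00b7\\3", text)


print(to_interpuncts("OR 1.25 (95% CI 0.98 to 1.60)."))
# prints: OR 1·25 (95% CI 0·98 to 1·60).
```

The sentence-ending period is left alone because it is not followed by a digit. In this Python version, a string with back-to-back periods such as `1.2.3` would only have its first period replaced, since the middle digit is consumed by the first match.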

We (as a field) should still be moving over to doing our drafting in Rmd or LaTeX though. The bloat on MS Word makes working with moderate sized manuscripts with figures painful.

I made an R package for working with NCHS multiple cause of death data.

November 13, 2017

Using R, Wikipedia, and SHERPA/RoMEO to show New England Journal of Medicine‘s pre-print statement is empirically false

October 8, 2017

One of the most fundamental aspects of collaborative research is sharing your work with others through pre-print or conference presentations. This isn’t likely to be news to anybody doing collaborative research these days, and many journals have become increasingly permissive with their pre-print policy. For example, Nature released an editorial making it clear, “Nature never wishes to stand in the way of communication between researchers.[…] Communication between researchers includes not only conferences but also preprint servers. The ArXiv preprint server is the medium of choice for (mainly) physicists and astronomers who wish to share drafts of their papers with their colleagues, and with anyone else with sufficient time and knowledge to navigate it. […] If scientists wish to display drafts of their research papers on an established preprint server before or during submission to Nature or any Nature journal, that’s fine by us.”¹ Other prestigious journals have similar policies—for example, The Lancet, Science, PNAS, and BMJ. (The list goes on and on.)

One such journal ~~does~~ did not. New England Journal of Medicine (Figure 1).

UPDATE: Since this post, NEJM has changed their position and pre-prints are allowed.

Amazing visualization of different MCMC sampling algorithms

January 27, 2017

Using a histogram as a legend in choropleths

January 16, 2017

Despite well-known drawbacks,¹ plotting parameters onto maps provides a convenient way of seeing context, patterns, and outliers. However, one of the many problems with choropleths is that the area of the regions tends to distort our perception of the value of the region. For example, in the United States, huge (in terms of land mass) counties will tend to have a greater visual impact than small counties (despite often having similar or even smaller population sizes).

One way to address this is to use a histogram as a legend on your map. The histogram then provides you with a way of showing raw counts of equal weights while the map allows you to provide the spatial context of the values.

On graduate student burnout: “It isn’t usually a snap so much as a gradual disintegration.”

January 7, 2017

PSA: Applications are open for the 2017 Data Science for Social Good Summer Fellowship. 10/10 would do again.

December 9, 2016

Use bash to concatenate files in R

November 9, 2016

Often, I find I need to loop through directories full of csv files, sometimes tens of thousands of them, in order to combine them into a single analytical dataset I can use. When it’s only a few dozen, using fread(), read_csv(), or the like can be fine, but nothing is quite as fast as using awk or cat.

Here’s a snippet of code that allows one to use bash in R to concatenate csv files in a directory. People in the lab have found it helpful so maybe others will as well.
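The post's snippet calls bash from R and is not reproduced here. As a rough analog of the same idea (stream the files together instead of parsing each one), here is a minimal Python sketch. The function name `concat_csvs` is hypothetical, and the sketch assumes every file shares the same single-line header and ends with a trailing newline.

```python
import shutil
from pathlib import Path


def concat_csvs(directory, out_path):
    """Concatenate every .csv in `directory` into `out_path`, keeping one header.

    Assumes all files share the same one-line header and end with a newline.
    Returns the number of files that were combined.
    """
    out_path = Path(out_path)
    # List the inputs up front, skipping the output file if it lives in `directory`.
    files = sorted(
        p for p in Path(directory).glob("*.csv")
        if p.resolve() != out_path.resolve()
    )
    with open(out_path, "wb") as out:
        for i, path in enumerate(files):
            with open(path, "rb") as src:
                header = src.readline()
                if i == 0:
                    out.write(header)  # keep the header from the first file only
                shutil.copyfileobj(src, out)  # stream the remaining rows unchanged
    return len(files)
```

Because rows are copied as raw bytes and never parsed, the cost is mostly disk I/O, which is the same reason shell tools like `cat` and `awk` are fast for this job. No speed comparison with those tools is claimed here.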

A visual tour of my publications

October 8, 2016

I recently came across this paper by Michal Brzezinski about (the lack of) power laws in citation distributions. It made me a little curious about the citations of my own articles so I threw together a little script using James Keirstead’s Scholar package for R. In the plot above, every line represents a single article with time on the x-axis and (cumulative) number of citations on the y-axis.

It’s not super informative, so we can break it down a few ways to graphically explore the data.

Our unsurprising result: Among other things, mutual respect is important for implementing large-scale healthcare initiatives.

September 24, 2016

And associated Reddit Science AMA on our last essay…

December 21, 2015

Sadly relevant: Our essay on police killings and police deaths.

December 21, 2015

Reporter posts his mobile phone metadata for the public to analyze

August 17, 2015

The colon operator really is the fastest.

June 9, 2015

Way back when I was first learning R, I ran across an old listserv post that talked about how the colon (:) operator was the fastest way to generate a sequence. I never really thought about it, but I got in the habit of always using it whenever I needed a sequence.

Our “deaths due to ‘legal interventions’” essay. It’s not peer-reviewed (and it’s a couple months old), but increasingly relevant.

April 30, 2015

tl;dr: How you implement things — even simple things like checklists — is important.

April 6, 2015

New MetroCard rates and the dreaded “dead zone of change”

March 23, 2015

Looking at abortion laws and infant deaths

March 12, 2015

Boston’s brutal winter

March 11, 2015

Waterfalls of Eligible Singles

March 8, 2015

As a Valentine’s Day (gag) gift to one of my friends, I created a Shiny app¹ that will calculate the number of people in the United States who meet specified sex, age, marital status, race/ethnicity, educational attainment, employment status, and annual income requirements.

tl;dr version: Discrimination is bad for everybody (especially those discriminated against).

June 25, 2014

Our new paper’s punchline: Health inequities need not rise as population health improves.

March 25, 2014

Shiny + deSolve = Interactive ODE Models

December 20, 2013

While taking a disease dynamics course, I thought it would be a good opportunity to learn how to use the Shiny package in R and create an interactive interface for some of my problem sets. After a few trial runs with smaller, simpler setups, I have wrapped up the side project (for now). You can see it in action here¹ and you can view the final code on my Git.

Finally, I am becoming stupider no more.

Paul Erdős