December 22, 2020
Plowing enormous fields of patient records to harvest useful data, Atul Butte, M.D., Ph.D., informatics expert, professor and founder of three data-driven start-up companies, shares how his work harnesses information to shape better patient care strategies.

“We have a responsibility to use these records to improve the practice of medicine,” says Butte, chief data scientist for University of California Health and director of the Bakar Computational Health Sciences Institute (BCHSI) at the University of California, San Francisco (UCSF, San Francisco, CA, USA), a cornerstone of the university’s efforts to manage big data and a core element of UCSF’s campus-wide efforts in precision medicine.

“The central data warehouse that we’ve developed offers us a better idea of what’s working and what’s not in medicine,” explains Butte. “Our own healthcare system has already saved millions of dollars when we searched the records and noticed that thousands of our patients were prescribed brand name instead of generic drugs. Continuing research might reveal other helpful information – drugs and devices that don’t change outcomes or practices that are outdated or unnecessarily expensive. The data found in the medical record is not perfect, and it can be hard to get to, but it is the record of what we’re doing to and measuring from patients, and it’s beneficial to how we treat them in the future.”

Butte has straddled the worlds of computer science and medicine since his undergraduate and medical school years at Brown University (Providence, RI, USA). During his 25-year career, he has frequently forged paths, built programs or even launched companies upon noticing unmet needs. No field of information is too large, and no lack of code to analyze it is too daunting. Butte is now in his sixth year at UCSF, where his team has compiled the clinical records from all six of the UC academic health campuses (including 20 health professional schools, six medical centers, 12 hospitals and more than 1,000 care delivery sites) into one database.


“Nowhere else in the United States do six large academic health centers share data in a manner like we do in the University of California,” Butte states. “Like many universities, we have multiple strengths and figuring out how to pull together and channel resources for the common good is a challenge. When it works, it’s incredible and I do believe nothing can stop us. I hope to show some patterns we’ve learned and highlight some of our achievements for the SLAS2021 Digital International Conference and Exhibition.”

Butte is excited about his keynote presentation for the global, virtual event, to be held January 25-27, 2021, which will feature dynamic scientific sessions, new product announcements, poster presentations and virtually enhanced networking and partnering opportunities.

The potential within the SLAS2021 audience impresses Butte, and he encourages Society members to learn as much as possible about computer learning and informatics. “It’s incredible how much data we all have,” he comments. “Scientists, engineers and researchers need to expand their technical knowledge, minimum data skills and even learn basic programming skills, whether it’s the computer language Python or a data science language, such as R. Those will put people in a much better position to navigate and understand their own data. We’re teaching third graders how to write programs all around the country. It’s about time all scientists learned how to code.”

He anticipates that the message of how medical devices and tools impact patients will inspire the SLAS members. “I want to show the informatics side of the devices and tools that have been engineered and used in medicine,” he explains. “I think that’s why I was invited to speak before SLAS — to share the power and potential of data.”

Harvesting the Health Record for Better Medicine

Now that his analysis of the UC Health records is yielding valuable information on prescription trends, Butte hopes to explore other aspects of pharmaceuticals within the records of the six campuses. “We’re looking at inappropriate drugs being used unnecessarily for in-patient procedures, pre- and post-surgical. We’re exploring newer areas of biosimilars. Do we have to use an expensive biologic or antibody, or could we use a less expensive biosimilar product when it’s available?”

Photo credit: Elisabeth Fall

He also plans to delve into the spectrum of cancer genetics data collected in the UC system. “We have one database now with nearly all the cancer genetics data across the entire University of California system, tens of thousands of patients,” he explains. “That means in the near future, when an oncologist at one of our campuses sees an unusual mutation, they can then search the database to see which patients in the system had the mutation, who treated them and what treatment was used. We have the capability to make connections across our huge university in that way.”

Butte is skilled at navigating enormous data fields. His team has often applied a systems-based approach to drug discovery for cancer and other diseases as a complement to the traditional target-based approach, showing that the method is sometimes more efficient because it identifies drugs originally developed for other diseases that can be repurposed. In one study, the researchers analyzed more than 66,000 compound gene expression profiles from the Library of Integrated Network-based Cellular Signatures (LINCS), more than 12 million compound activity measurements from the ChEMBL19 database, over 1,000 cancer cell line molecular profiles from the Cancer Cell Line Encyclopedia and more than 7,500 cancer patient samples from The Cancer Genome Atlas. With their computational method, Butte’s team identified four drug candidates to potentially treat hepatocellular carcinoma, a type of liver cancer for which no effective therapy exists.

The ongoing pandemic is another repository of data to be explored, says Butte. In addition to collecting all the COVID-19 data and organizing it daily for UC Health, his lab was one of the first to offer county-by-county data on COVID-19 cases. Publications such as AARP magazine and Reader’s Digest (UK) have also interviewed him about his team’s 2018 review of 242 immunity studies, which reveals patterns in how immune systems change with age, and about how this information helps in the fight against COVID-19.


“If you don’t have a family member who has had SARS-CoV-2, it’s hard to connect with what’s going on,” explains Butte, adding that UC Health has tested nearly 400,000 patients across the six campuses since cases began to emerge at the beginning of 2020. “The public needs to realize that this is not an invisible disease. We want to remind people that it is real, and we are all struggling with COVID. We’re transparent with information so that the public can be better informed about what they should do.”

Exploring Science

Butte’s fascination with computers and medicine was largely shaped by his parents during his childhood in Cherry Hill, NJ, USA. His father taught biology at the local community college, and when Butte was in middle school, his mother decided to pursue a degree in the newly launched field of computer programming. Butte happily learned the skill alongside her. Writing code in a notebook, he eagerly awaited family trips to the local shopping mall to test his work on the Apple II Plus personal computers sold there. After several of these trips, Butte’s parents decided to invest in a computer of their own.

The hobby grew into a degree program during his eight years at Brown University. Studying computer science and medicine during the school year, Butte spent summers at Apple and Microsoft mastering his code-writing skills. Between his third and fourth years in medical school, he took a transformative one-year internship with the Howard Hughes Medical Institute (HHMI) and the National Institutes of Health (NIH). He had never set foot in a biology lab and thought the experience would be valuable. He was right. Captivated by both insulin receptors and pediatric endocrinology, Butte went on to practice and research in pediatrics and informatics at Boston Children’s Hospital.

Wanting to further develop his expertise, Butte then pursued a dual program offered through the Massachusetts Institute of Technology (MIT, Cambridge, MA, USA) and Harvard Medical School (Boston, MA, USA), from which he earned a Ph.D. in health sciences and technology. He reports that time spent on patient care eventually gave way to finding new ways to help more patients through medical informatics. “I miss seeing patients. It was a hard decision, but I truly believe I am helping more patients [through informatics] than seeing them one by one,” he commented in a UCSF interview.

In his journey to UCSF, Butte also became a translational expert and encourages others in academia to engage in similar pursuits. “I advise academics to innovate beyond their university. Learn how to file intellectual property and patents, if necessary,” he explains. “If no one licenses a patent, think of starting your own company. This approach has worked well for me and a number of my graduate students.” Butte’s lab discoveries entered the marketplace through the three Bay Area biotech start-up companies that he founded: Personalis (IPO, 2019), which provides medical genome sequencing services; Carmenta (acquired by Progenity in 2015), which discovers diagnostics for pregnancy complications; and NuMedii, which finds new uses for drugs through open molecular data.

“What we discover in academic labs doesn’t have to end in academia. The science can continue to develop through a start-up company and then scale up,” says Butte, who has been continually funded by the NIH for 22 years, is an inventor on 24 patents and an author of more than 200 publications, with research repeatedly featured in the New York Times, Wall Street Journal and Wired Magazine.

“I think that there are plenty of amazing, brilliant scientists at universities around the world who could take their inventions further, but are really stymied by The Valley of Death,” comments Butte, who carries the title of Priscilla Chan and Mark Zuckerberg Distinguished Professor at UCSF. He adds that innovation will require investing in early career scientists and discoveries.

“With the costs of development skyrocketing, and some who argue that the low-hanging fruit in drug development has been picked, it’s getting harder and harder to develop new diagnostics to make progress in life science,” says Butte, adding that computational artificial intelligence is going to make a difference, but not a striking one.

“AI doesn’t solve all the world’s problems. Innovation is challenging. The answer is to reach out around the world to find those early-stage discoveries and to talk to the scientists in academia who have an amazing idea, but do not have the resources to take it forward. That is how future development is going to happen,” says Butte.

Butte encourages SLAS’s student members to seek out mentors to guide them through early career challenges. “The importance of a mentor never ends in one’s career, and it’s valuable to give back to the process,” says Butte, who is still close friends with his Harvard Medical School mentor, Isaac Kohane, M.D., Ph.D.

“Mentorship is desperately needed, especially as people make challenging early career choices. When you’re not getting the grants or start-up funding that you need to succeed, sometimes the most important lesson you can learn is patience. Observe your mentor’s career. Notice how long it took your mentor to get launched. Knowing such things gives you patience to keep moving forward and trying new angles. We can overcome any obstacles by looking to see what others have done before.”

Editor’s Note: Consider participation in SLAS Mentor Match.

He urges all life sciences discovery and technology professionals at every stage in their careers to strive for continuing education. “This is a world of lifelong learning and there’s no reason for education to end. The opportunities are there; seek them out through professional organizations such as SLAS and universities. Stay as young as you can in your thinking and vision,” he explains. “We should always be trying to change the world. Those with energy and enthusiasm should continually try to reinvent what they are doing. The world needs it.”