Our Focus: Big Data and More

Technological advances have made vast quantities of data available for analysis. These data are often generated as a byproduct of technology use rather than by a well-controlled scientific process. Many traditional data science tools and algorithms cannot handle data at this scale computationally.

The Center for Scalable Computing and Analysis supports RAND researchers—and their clients—by applying methodological and technical rigor to sometimes confounding questions. Here are some of our key activities, with some examples and links.

Analyzing Data and Computing Infrastructure

At the Center for Scalable Computing and Analysis, we examine data and computing infrastructure, the technology that generates and processes big data, on behalf of both RAND and its clients. We work with other methodologists, such as those focusing on causal inference or mixed methods, to explore how their methods can be applied when the research involves large-scale data sets.

More on Data Science at RAND

Developing Algorithms

We also develop and use new algorithms to extract useful information from large, policy-relevant data sets, including geospatial images, natural language data in social media, and modeling and simulation outputs. Many of these information sources would remain inscrutable without scalable computing and analysis.

  • Examining ISIS Support and Opposition on Twitter

    ISIS uses Twitter to inspire followers, recruit fighters, and spread its message. Its opponents use Twitter to denounce the group. To identify and characterize both networks in detail, researchers use a mixed-methods analytic approach that combines three techniques: community detection algorithms, which help detect interactive communities of Twitter users; lexical analysis, which can identify key themes and content in large data sets; and social network analysis.

  • Using High-Performance Computing to Support Water Resource Planning

    Researchers from RAND and the Lawrence Livermore National Laboratory used high-performance computer simulations to stress-test several water management strategies over many plausible future scenarios in near real time.
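
To make the community detection step in the Twitter example above more concrete, here is a minimal sketch of one common technique, label propagation, in plain Python. It is an illustration only and is not the researchers' actual method: the function names, the toy interaction pairs, and the deterministic tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (user, user) interaction pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def label_propagation(graph, max_rounds=20):
    """Give each node the label most common among its neighbors until stable.

    Assumption for reproducibility: ties are broken deterministically (keep
    the current label if it is tied for first place, otherwise take the
    largest label). Published variants usually break ties at random.
    """
    labels = {node: node for node in graph}  # every node starts alone
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            counts = Counter(labels[nbr] for nbr in graph[node])
            if not counts:
                continue  # isolated node keeps its own label
            top = max(counts.values())
            best = [lab for lab, n in counts.items() if n == top]
            new = labels[node] if labels[node] in best else max(best)
            if new != labels[node]:
                labels[node] = new
                changed = True
        if not changed:
            break
    return labels


def communities(labels):
    """Group nodes by final label and return sorted member lists."""
    groups = defaultdict(set)
    for node, lab in labels.items():
        groups[lab].add(node)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical interaction pairs (e.g., who mentions whom); not real data.
EDGES = [
    ("a", "b"), ("b", "c"), ("a", "c"),  # one tightly knit group
    ("d", "e"), ("e", "f"), ("d", "f"),  # a second group
    ("c", "d"),                          # a single bridge between them
]

found = communities(label_propagation(build_graph(EDGES)))
print(found)
```

On this toy graph the sketch separates the two triangles into distinct communities despite the bridge between them. At the scale of real social media data, the same idea would need a scalable graph library and randomized tie-breaking rather than this single-machine loop.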

More on Data Analysis at RAND

Understanding the Implications of Big Data

Finally, we study the implications of proliferating data and advanced algorithms for society. These include data privacy, equity, ethics, and even geopolitical concerns. We aim to develop tools and expertise that will allow us to better understand and address these challenges.

  • Thinking Machines Will Change Future Warfare

    Until now, deterrence has been about humans trying to dissuade other humans from doing something. But what if the thinking is done by AI and autonomous systems? A wargame explored what happens to deterrence when decisions can be made at machine speeds and when states can put fewer human lives at risk.

  • Addressing the Challenges of Algorithmic Equity

    Social institutions increasingly use algorithms for decisionmaking purposes. How do different perspectives on equity or fairness inform the use of algorithms in the context of auto insurance pricing, job recruitment, and criminal justice?

More on Big Data at RAND

Center Leadership