Visualizing The Invisibles

Medium
Data Visualization, Creative Coding

Methodology
MySQL, Processing, Boids Simulation

Exhibition
2023 | latent•ville: MAT End of Year Show, UCSB, Santa Barbara

Visualizing The Invisibles is a data visualization strategy that transforms co-occurrence patterns in library check-in/check-out data into self-organizing, agent-based behavioral patterns. The project investigates the public cognition of “architecture” by querying, analyzing, and processing the subjects of books that were borrowed together with books whose titles contain “architecture”. The visualization takes the form of an artificial life simulation, in which Boids, as a form of emergent behavior, introduce complexity, unpredictability, and interactivity.

Data Mining

Data Source: Seattle Public Library (SPL), 110 million check-out/check-in records. The SPL metadata of each record contains the book title, deweyClass, subject, check-in timestamp, and check-out timestamp.

Based on the assumption that books borrowed together may suggest a correlated cognition of each other’s subjects, the data query pairs book titles containing the keyword “architecture” with books that do not contain the keyword but were checked out and checked in at the same time.
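
The pairing step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the project's MySQL query: the row layout `(title, checkout_time, checkin_time)`, the helper name `co_borrowed_pairs`, and the placeholder titles and timestamps are all invented for the example.

```python
from collections import defaultdict

# Placeholder rows (invented, not SPL data): (title, checkout_time, checkin_time)
records = [
    ("Architecture Primer", "2010-03-01 14:05", "2010-03-15 10:00"),
    ("City Planning Notes", "2010-03-01 14:05", "2010-03-15 10:00"),
    ("Garden Design", "2010-03-01 14:05", "2010-03-15 10:00"),
    ("Bread Baking", "2010-03-02 09:30", "2010-03-20 16:45"),
    ("Islamic Architecture Survey", "2010-03-03 11:15", "2010-03-17 12:30"),
    ("Structural Basics", "2010-03-03 11:15", "2010-03-17 12:30"),
]

KEYWORD = "architecture"


def co_borrowed_pairs(rows, keyword=KEYWORD):
    """Pair each keyword title with non-keyword titles that share
    both its check-out and its check-in timestamp."""
    by_loan = defaultdict(list)
    for title, checkout_time, checkin_time in rows:
        by_loan[(checkout_time, checkin_time)].append(title)

    pairs = []
    for titles in by_loan.values():
        anchors = [t for t in titles if keyword in t.lower()]
        others = [t for t in titles if keyword not in t.lower()]
        pairs.extend((a, o) for a in anchors for o in others)
    return pairs


pairs = co_borrowed_pairs(records)
for anchor, other in pairs:
    print(f"{anchor} <-> {other}")
```

Grouping by the exact timestamp pair is the simplest reading of "at the same time"; a real query might instead tolerate a small time window.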

co-occurrence record

From the queried data, each subject that appears in the dataset is processed into the following data records so that it can be referenced in the agent-based simulation:
1. Its own Dewey classes
2. Subjects that co-occurred with it within the same book title
3. Frequencies of those co-occurrences
4. Dewey classes of the co-occurring books
5. Subjects of the co-occurring books
6. Frequencies of those co-occurrences
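
One possible in-memory shape for these six records is sketched below. The class name `SubjectRecord`, its field names, the dict layout of a book, and the sample values are assumptions made for illustration; the source does not specify how the records were stored. Frequencies here are counted once per borrowed-together pair, which is also an assumption.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SubjectRecord:
    """Hypothetical container for the six records kept per subject."""
    dewey_classes: set = field(default_factory=set)                # item 1
    same_title_subjects: Counter = field(default_factory=Counter)  # items 2-3
    co_book_dewey: set = field(default_factory=set)                # item 4
    co_book_subjects: Counter = field(default_factory=Counter)     # items 5-6


def build_records(pairs):
    """pairs: iterable of (book, co_book), each a dict with
    'dewey' (str) and 'subjects' (list of str)."""
    table = {}
    for book, co_book in pairs:
        for subject in book["subjects"]:
            rec = table.setdefault(subject, SubjectRecord())
            rec.dewey_classes.add(book["dewey"])
            for other in book["subjects"]:
                if other != subject:
                    rec.same_title_subjects[other] += 1
            rec.co_book_dewey.add(co_book["dewey"])
            for other in co_book["subjects"]:
                rec.co_book_subjects[other] += 1
    return table


# Invented sample books, for illustration only
book_a = {"dewey": "720.1", "subjects": ["Architecture", "Design"]}
book_b = {"dewey": "635.9", "subjects": ["Gardening"]}
book_c = {"dewey": "711.4", "subjects": ["City planning", "Design"]}

table = build_records([(book_a, book_b), (book_a, book_c)])
print(table["Architecture"])
```

A `Counter` keeps each related subject and its frequency together, so items 2 and 3 (and likewise 5 and 6) live in one field.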

agent-based datasets

Visualization

The agent-based simulation is not intended to deliver a result at the outset. Instead, the viewer interacts with the provided parameters and observes the self-organized forms that emerge. Different parameter setups can lead to different results.

The Boids system contains two components:

I. Static points, which represent Dewey classes

    • Position
      first three digits of the Dewey class (abc.def)
      a -> point.x, b -> point.y, c -> point.z
    • Scale
      determined by the number of subjects that belong to the Dewey class
    • Attraction force
      attracts the agents that belong to this Dewey class
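
The digit-to-coordinate mapping above could look roughly like the following. The function names, the `spacing` multiplier, and the linear scale rule with its `base` and `per_subject` constants are hypothetical; the source only states that scale is determined by the number of subjects.

```python
def dewey_to_position(dewey, spacing=100.0):
    """Map the first three digits of a Dewey class (abc.def) to (x, y, z).

    `spacing` is an assumed multiplier that spreads the points out in space.
    """
    digits = dewey.split(".")[0].zfill(3)  # assumption: restore dropped leading zeros
    if len(digits) != 3 or not digits.isdigit():
        raise ValueError(f"not a Dewey class: {dewey!r}")
    a, b, c = (int(d) for d in digits)
    return (a * spacing, b * spacing, c * spacing)


def point_scale(subject_count, base=2.0, per_subject=0.5):
    """Assumed linear rule: more subjects in a class give a larger point."""
    return base + per_subject * subject_count


print(dewey_to_position("720.1"))  # a=7, b=2, c=0
print(point_scale(10))
```

Because each digit ranges from 0 to 9, all Dewey class points fall on a 10 x 10 x 10 grid, with related classes (same first digit) sharing an x coordinate.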

II. Swarm agents that represent subjects

These agents are initialized with random positions and random flight directions. When an agent meets another agent or a Dewey class point within its search distance, it checks whether they are related, either through co-occurrence or directly, and how strong the connection is, then calculates the vector of its next movement based on the flocking principles of alignment, cohesion, and separation.

  • Separation force
    steers away from agents that have no relation to it
  • Alignment force
    aligns with the velocity vectors of agents that have a co-occurrence relation with it
  • Cohesion force
    coheres toward the flock center of all related agents nearby
  • Random force
    adds a random velocity vector to the next movement
  • Path
    a path line formed by previous positions. Its color ranges from black to red: the closer the agent gets to the Dewey class points it belongs to, the redder it becomes. Older position points fade out.
  • Connection to Dewey class
    a connection line from the agent to the Dewey class points it belongs to. Its color ranges from black to red: the closer the agent gets to those points, the redder it becomes.
  • Connection to co-occurrence-related agents
    a connection line from the agent to surrounding agents related by co-occurrence. Its color is the average of the colors at its two endpoints.
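
The four steering forces listed above can be combined as in the minimal sketch below. It is written in plain Python rather than Processing, and the agent dict layout, the `weights` keys, the inverse-square separation falloff, and the function name `steer` are assumptions for illustration, not the project's actual code. The neighbours passed in are assumed to be already filtered by search distance, and the Dewey class attraction is left out.

```python
import math
import random


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(a, k):
    return tuple(x * k for x in a)


def mean(vectors):
    n = len(vectors)
    return tuple(sum(c) / n for c in zip(*vectors))


def steer(agent, neighbours, related_ids, weights, rng=random):
    """Return the summed steering force for one agent.

    agent / neighbours: dicts with 'id', 'pos', 'vel' (3-tuples).
    related_ids: ids of agents that co-occur with this agent.
    """
    pos, vel = agent["pos"], agent["vel"]
    force = (0.0, 0.0, 0.0)
    kin = [n for n in neighbours if n["id"] in related_ids]
    strangers = [n for n in neighbours if n["id"] not in related_ids]

    # Separation: push away from unrelated agents, stronger when closer
    for n in strangers:
        away = sub(pos, n["pos"])
        d = math.sqrt(sum(x * x for x in away))
        if d > 0:
            force = add(force, scale(away, weights["separation"] / (d * d)))

    if kin:
        # Alignment: steer toward the average velocity of related agents
        avg_vel = mean([n["vel"] for n in kin])
        force = add(force, scale(sub(avg_vel, vel), weights["alignment"]))
        # Cohesion: steer toward the center of related agents
        center = mean([n["pos"] for n in kin])
        force = add(force, scale(sub(center, pos), weights["cohesion"]))

    # Random: small jitter added to the next movement
    jitter = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
    return add(force, scale(jitter, weights["random"]))


me = {"id": 0, "pos": (0.0, 0.0, 0.0), "vel": (0.0, 0.0, 0.0)}
friend = {"id": 1, "pos": (2.0, 0.0, 0.0), "vel": (0.0, 1.0, 0.0)}
stranger = {"id": 2, "pos": (1.0, 0.0, 0.0), "vel": (0.0, 0.0, 0.0)}
w = {"separation": 1.0, "alignment": 1.0, "cohesion": 0.5, "random": 0.0}
print(steer(me, [friend, stranger], {1}, w))
```

Exposing the weights as parameters mirrors the idea that different parameter setups lead to different self-organized forms.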

start
end