Open Data Challenge
Submissions accepted through January 31, 2025
Our goal in this challenge is to determine whether connectivity information alone is sufficient to align connectomes from two distinct datasets. Specifically, we examine the ventral nerve cords (VNCs) of male and female Drosophila, each comprising approximately 19,000 neurons. Successfully aligning these connectomes can reveal important insights into the similarities and differences between, for example, the male and female motor circuits.
We frame the alignment task as an approximate graph matching problem between two graphs that represent the respective connectomes.
Input and objective
You are given two weighted directed graphs GM = (VM, EM) and GF = (VF, EF), where:
- nodes V represent individual neurons
- edges E represent synaptic connections, with edge weights indicating the number of synapses between connected neurons
VM and VF are the same size, and your task is to produce a 1:1 mapping f: VM → VF that aligns the connectivity of the two graphs as closely as possible.
Specifically, the mapping alignment score is defined as Σ min(EM(x, y), EF(f(x), f(y))),
where summation is over all pairs of nodes x, y in VM
(if an edge is not specified in the graph’s edge list, assume the edge weight is 0).
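To make the definition concrete, here is a minimal sketch that evaluates the score on two tiny made-up graphs. The function name `alignment_score`, the node IDs, and the edge weights are all hypothetical and are not taken from the challenge data.

```python
def alignment_score(male_edges, female_edges, mapping):
    """Sum of min(EM(x, y), EF(f(x), f(y))) over all male edges.

    Pairs missing from male_edges have weight 0 and contribute
    min(0, anything) = 0, so iterating over the male edge list is enough.
    """
    total = 0
    for (x, y), weight in male_edges.items():
        mapped_pair = (mapping[x], mapping[y])
        total += min(weight, female_edges.get(mapped_pair, 0))
    return total


# Hypothetical toy graphs: (source, target) -> synapse count.
male = {("m1", "m2"): 5, ("m2", "m3"): 2}
female = {("f1", "f2"): 3, ("f2", "f3"): 4}

in_order = {"m1": "f1", "m2": "f2", "m3": "f3"}
shifted = {"m1": "f2", "m2": "f3", "m3": "f1"}

print(alignment_score(male, female, in_order))  # min(5, 3) + min(2, 4) = 5
print(alignment_score(male, female, shifted))   # min(5, 4) + min(2, 0) = 4
```

Note that the score rewards only the overlap between matched edges: a male edge of weight 5 mapped onto a female edge of weight 3 contributes 3, and an edge mapped onto a missing edge contributes nothing.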
Your goal is to find a mapping f that maximizes the alignment score.
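As one illustration of how a mapping might be improved, the sketch below uses random pairwise-swap hill climbing. This is a generic local search shown only as an example; it is not the method behind the benchmark matching, and it is not guaranteed to find the best mapping. The names `score` and `swap_hill_climb` and the toy graphs are hypothetical. The sketch recomputes the full score after every swap, which is practical only on toy inputs; at roughly 19,000 nodes one would presumably need to update the score incrementally.

```python
import random


def score(male_edges, female_edges, mapping):
    """Alignment score: sum of min(male weight, mapped female weight)."""
    return sum(
        min(w, female_edges.get((mapping[x], mapping[y]), 0))
        for (x, y), w in male_edges.items()
    )


def swap_hill_climb(male_edges, female_edges, mapping, steps=1000, seed=0):
    """Swap the targets of two random male nodes; keep strict improvements."""
    rng = random.Random(seed)
    mapping = dict(mapping)  # work on a copy, leave the input untouched
    nodes = list(mapping)
    best = score(male_edges, female_edges, mapping)
    for _ in range(steps):
        a, b = rng.sample(nodes, 2)
        mapping[a], mapping[b] = mapping[b], mapping[a]
        candidate = score(male_edges, female_edges, mapping)
        if candidate > best:
            best = candidate
        else:
            # Undo the swap so the mapping stays at the best score seen.
            mapping[a], mapping[b] = mapping[b], mapping[a]
    return mapping, best


# Hypothetical toy graphs (not challenge data).
male = {("m1", "m2"): 5, ("m2", "m3"): 2, ("m3", "m1"): 1}
female = {("f2", "f3"): 5, ("f3", "f1"): 2, ("f1", "f2"): 1}
start = {"m1": "f1", "m2": "f2", "m3": "f3"}

improved, improved_score = swap_hill_climb(male, female, start, steps=200)
print(score(male, female, start), improved_score)
```

Because a swap only exchanges the targets of two nodes, the result is always a valid 1:1 mapping whenever the starting mapping is one.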
Prerequisites
This is a data analysis / optimization challenge, and it is open to everyone. Participants do not have to know anything about the Drosophila brain or connectomics. Data processing and some algorithmic skills should suffice to compete.
Data
- Male VNC Connectome - weighted directed graph (download here)
- Female VNC Connectome - weighted directed graph (download here)
- Benchmark Matching - 1:1 mapping of male/female graph nodes with a decent alignment score (download here)
# Compute the alignment score of the benchmark matching.
import csv
import gzip

def read_csv(filename):
    # Return all rows (header included) of a gzip-compressed CSV file.
    with gzip.open(filename, "rt", newline="") as f:
        return list(csv.reader(f))

male_edges = {(r[0], r[1]): int(r[2]) for r in read_csv("male_connectome_graph.csv.gz")[1:]}
female_edges = {(r[0], r[1]): int(r[2]) for r in read_csv("female_connectome_graph.csv.gz")[1:]}
matching = {r[0]: r[1] for r in read_csv("vnc_matching_submission_benchmark_5154247.csv.gz")[1:]}
alignment = 0
for male_nodes, edge_weight in male_edges.items():
    female_nodes = (matching[male_nodes[0]], matching[male_nodes[1]])
    alignment += min(edge_weight, female_edges.get(female_nodes, 0))
print(f"{alignment=}")
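Since the score is only meaningful for a 1:1 mapping, it may help to check a candidate mapping before scoring it. The sketch below is one possible check; the function name `check_one_to_one` is hypothetical. It assumes the full node sets are taken from the two columns of the benchmark matching file rather than from the edge lists, since nodes without edges would not appear in an edge list.

```python
def check_one_to_one(mapping, male_nodes, female_nodes):
    """Return a list of problems; an empty list means the mapping is 1:1."""
    male_nodes, female_nodes = set(male_nodes), set(female_nodes)
    problems = []

    unmapped = male_nodes - set(mapping)
    if unmapped:
        problems.append(f"{len(unmapped)} male node(s) have no partner")

    unknown_male = set(mapping) - male_nodes
    if unknown_male:
        problems.append(f"{len(unknown_male)} unknown male node ID(s)")

    targets = list(mapping.values())
    if len(set(targets)) != len(targets):
        problems.append("a female node is used more than once")

    unknown_female = set(targets) - female_nodes
    if unknown_female:
        problems.append(f"{len(unknown_female)} unknown female node ID(s)")

    return problems


# Hypothetical toy IDs (not challenge data).
males = ["m1", "m2", "m3"]
females = ["f1", "f2", "f3"]

good = {"m1": "f3", "m2": "f1", "m3": "f2"}
bad = {"m1": "f1", "m2": "f1"}  # m3 is unmapped and f1 is reused

print(check_one_to_one(good, males, females))
print(check_one_to_one(bad, males, females))
```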
How to submit solutions
- Email your solution to arie@princeton.edu, including:
  - The mapping f: VM → VF as a CSV file with 2 columns: Male Node ID, Female Node ID (same format as the provided benchmark matching)
  - Your alignment score: Σ min(EM(x, y), EF(f(x), f(y))), summed over all pairs of nodes x, y in VM
  - A short explanation of how the result was achieved (method / main idea)
  - A brief introduction of yourself and how you came to learn about the challenge
  - Whether it is OK to display your name on the leaderboard and, if so, how it should appear
- Submissions will be verified and the leaderboard updated periodically until the end of day (ET) on January 31, 2025.
- There is no limit on the number of submissions per participant. Only the best one will be shown.
- Winners of the challenge will receive a special plaque from the FlyWire team at Princeton University, and will be invited to give a short presentation (optional).
- To avoid disadvantaging early submissions, scores and solutions will be revealed after the challenge is complete.
- The female and male VNC connectomes provided here are partial subsets extracted from the BANC and MANC datasets, respectively.
- While the number of nodes is the same in both connectomes, the edges and their weights differ. Additionally, some nodes may have no edges attached. You are free to pair the nodes in any way, with the sole objective of maximizing the alignment score defined above.
- Much of the data used in this challenge is unpublished. Please do not publish or redistribute without first contacting flywire@princeton.edu
- Special thanks to Dr. Alexander Bates for contributing to the benchmark 1:1 mapping
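For the submission file described under "How to submit solutions", a mapping can be written out as a two-column CSV along the following lines. This is a sketch under assumptions: the function name `write_matching` is hypothetical, and the exact header text is taken from the column names given in the submission instructions, so it is worth comparing against the provided benchmark matching file before sending.

```python
import csv
import io


def write_matching(mapping, out):
    """Write a male -> female mapping as a two-column CSV to a text stream."""
    writer = csv.writer(out, lineterminator="\n")
    # Assumed header; compare with the benchmark matching file.
    writer.writerow(["Male Node ID", "Female Node ID"])
    for male_id in sorted(mapping):
        writer.writerow([male_id, mapping[male_id]])


# Hypothetical toy mapping, written to an in-memory buffer for illustration.
buffer = io.StringIO()
write_matching({"m2": "f1", "m1": "f2"}, buffer)
print(buffer.getvalue())
```

To produce a file on disk, the same function can be passed a handle opened with `open(path, "w", newline="")`, where `path` is whatever file name you choose.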