Genetic Affairs AutoCluster – How does that work??
A How-to for Genetic Affairs AutoCluster
Hello fellow DNA enthusiasts and the AutoCluster curious.
It’s been a thrillingly wild ride since Evert-Jan Blom released Genetic Affairs AutoCluster yesterday morning. Response has been amazing and I’ve been able to tell him, “I told you so!” about a million times. Thanks to everyone who took the time to try it out and for the great ongoing discussions.
For more information on the concept of match clustering, see Bettinger, Blaine T. “Clustering Shared Matches,” The Genetic Genealogist, 3 January 2017.
There is a Genetic Affairs manual available here but there seems to be a call for a more detailed, step-by-step instruction set for AutoCluster. This post is for that purpose. I love using DNA to solve mysteries and find new ones, but may never feel I’m an expert. If you see anything in the instructions that doesn’t ring right, please kindly correct me in the comments.
Open a PDF version –> AutoCluster How-To V2 12-3-18
Or follow along…
How to set up and run Genetic Affairs AutoCluster
Go to https://members.geneticaffairs.com/register and create an account.
Set up websites:
From the top navigation, choose Websites & Profiles > Add New Website
Choose the appropriate testing company
Enter your login and kit information
(Note: If you are not comfortable providing this information for AncestryDNA, you can create a new account with a different username and password, share your DNA with this new account, and provide that login to Genetic Affairs. How? Create the new account from Ancestry’s login screen. Then log into your regular account and, for each kit, go to Settings > DNA Ethnicity and Match Access > add a person. Fill out the form (email address tends to work better than username). Keep the Role set to Viewer, then click Send Invitation. Be sure to confirm the AncestryDNA email invitation for each kit you do this for.)
When the credentials are confirmed, click Close
Set up Profiles
From the Websites page, click the profiles icon
Verify that Genetic Affairs automatically pulled in all of the DNA kits associated with the login. You can delete the ones you don’t want in the system.
To add new profiles to the AncestryDNA website definition, go to Websites & Profiles on the top bar and, under Available websites, click on AncestryDNA – Username. At the bottom of the subsequent page, click Retrieve/Update New Profiles for this Website. The new profiles should be pulled in. The same steps are used for 23andMe.
To add an FTDNA profile, from the Websites & Profiles tab on the top bar, click Add Website, then Add FamilyTreeDNA Account and proceed with login info.
On the profiles page, for the profile you want, click the AutoCluster Icon
There are three ways to specify which matches to include in the AutoCluster:
- Approach A: this one allows you to control the contents of the cluster using cMs only. Set the range per your needs.
- Approach B: here you are using the testing company’s predicted relationships as inclusion parameters, but keep in mind these predictions are just that, predictions.
- Approach C: this one is a hybrid of the first two and will apply both a cM range and a predicted relationship range. If the match falls into either one, it will be included.
Note: AutoCluster will only pull in those matches who also have shared matches (also called “in common with” or ICW matches) with you. So some matches, even though they fall within the user-specified parameters, will not be included in the clusters chart.
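To make the difference between the three approaches concrete, here is a minimal Python sketch of the inclusion logic as I understand it from the descriptions above. It is not Genetic Affairs’ actual code. The `Match` fields, the relationship labels and their ordering, and the cM numbers in the usage note are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical ordering of predicted-relationship labels, closest first.
# Real testing companies use their own labels.
RELATIONSHIP_ORDER = [
    "1st cousin", "2nd cousin", "3rd cousin", "4th cousin", "distant cousin",
]


@dataclass
class Match:
    name: str
    shared_cm: float         # cM shared with you
    predicted: str           # one of RELATIONSHIP_ORDER
    shared_match_count: int  # how many of your other matches this person also matches


def in_cm_range(match, min_cm, max_cm):
    """True if the match's shared cM falls inside the chosen range."""
    return min_cm <= match.shared_cm <= max_cm


def in_relationship_range(match, closest, most_distant):
    """True if the predicted relationship falls between the two labels."""
    low = RELATIONSHIP_ORDER.index(closest)
    high = RELATIONSHIP_ORDER.index(most_distant)
    return low <= RELATIONSHIP_ORDER.index(match.predicted) <= high


def select_matches(matches, approach, cm_range=None, rel_range=None):
    """Pick matches per Approach A (cM), B (relationship), or C (either)."""
    selected = []
    for m in matches:
        by_cm = cm_range is not None and in_cm_range(m, *cm_range)
        by_rel = rel_range is not None and in_relationship_range(m, *rel_range)
        if approach == "A":
            keep = by_cm
        elif approach == "B":
            keep = by_rel
        elif approach == "C":
            keep = by_cm or by_rel
        else:
            raise ValueError(f"unknown approach: {approach}")
        # Per the note above: a match with no shared matches is left out
        # even when it falls inside the chosen parameters.
        if keep and m.shared_match_count > 0:
            selected.append(m)
    return selected
```

For example, with four invented matches (Ann 120 cM, Bob 30 cM, Cy 200 cM but no shared matches, Dee 45 cM), `select_matches(matches, "A", cm_range=(50, 400))` keeps only Ann: Bob and Dee fall below the range, and Cy is dropped for having no shared matches.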
Choose the best option for your needs (I like Approach A and tend to start with the defaults), adjust the parameters to suit and click Perform Analysis and confirm the Are You Sure? pop-up.
If the login credentials are correct, the server is able to send the request to the test company, and the test company accepts it, a Success! message appears.
Go back to your Profiles and change the “Update Interval” to “never” unless you want to receive periodic updates on new matches. To change this for all profiles, use the option at the bottom of the Profiles page.
You will receive the chart by email in HTML format.
The email will also contain two .csv files. One contains all of your matches from the test site, and the other is the basis for the chart.
If the email does NOT have the HTML attachment, it is likely your parameter settings or your particular match list do not provide enough matches to build a chart. Or, your range is so wide that the connection timed out prior to the collection of all data. A message to this effect should be found in the body of the email message. If a match .csv is attached, you can use it to determine better AutoCluster parameters.
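If you end up with only the match .csv, a quick count of how many matches land in a candidate cM range can help you pick better parameters. The sketch below is one way to do that in Python. The column name `shared_cm` and the sample rows are assumptions for illustration, so check the header row of your own file and adjust.

```python
import csv
import io


def count_matches_in_range(csv_text, min_cm, max_cm, cm_column="shared_cm"):
    """Count rows whose cM value falls inside [min_cm, max_cm].

    cm_column is a hypothetical column name; use the one in your file.
    Rows with a missing or non-numeric cM value are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    count = 0
    for row in reader:
        try:
            cm = float(row[cm_column])
        except (KeyError, ValueError, TypeError):
            continue
        if min_cm <= cm <= max_cm:
            count += 1
    return count


# Invented sample data standing in for the emailed match file.
sample = "name,shared_cm\nAnn,120\nBob,30\nCy,200\nDee,45\n"
narrow = count_matches_in_range(sample, 50, 400)
wide = count_matches_in_range(sample, 20, 400)
```

To use it on a real file, read the file’s contents into a string first (for example with `open(path, newline="").read()`) and pass that in as `csv_text`. Comparing the counts for a few ranges shows whether a range is too narrow to build a chart or so wide that the run might time out.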
Save the HTML file to your hard drive before opening it.
The AutoCluster file is named “AutoClust_company-name_profile-name_date.html”. If the current limit of one per day is lifted and you run more than one in the same day, you’ll want to rename them as you save.
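If you do save several charts with the same name, a tiny helper like the following can suggest a name that does not collide with one you already have. This is just a sketch: the `_run2` style suffix is my own convention, and the example file name and date format are invented.

```python
from pathlib import PurePath


def unique_save_name(filename, existing_names):
    """Return filename, or a '_runN' variant if that name is already taken."""
    path = PurePath(filename)
    candidate = path.name
    counter = 2
    while candidate in existing_names:
        # Insert the run counter before the .html extension.
        candidate = f"{path.stem}_run{counter}{path.suffix}"
        counter += 1
    return candidate


saved = {"AutoClust_AncestryDNA_Jane_2019-01-01.html"}
next_name = unique_save_name("AutoClust_AncestryDNA_Jane_2019-01-01.html", saved)
```

Here `next_name` comes out as `AutoClust_AncestryDNA_Jane_2019-01-01_run2.html`, so the first chart of the day is left untouched.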
The AutoCluster chart format was recently improved thanks to Graham Hart. The legend now resides outside of the grid, and hovering on a cluster name in the legend highlights that cluster and lists the user names of the matches it contains. Graham has also created a program that will convert your older AutoCluster HTML files into the improved format. Information on that nifty tool is HERE.
Interpreting your AutoCluster chart
- AutoCluster merely organizes your shared matches into logical groups. It does not answer any genealogical questions on its own.
- AutoCluster charts will be as varied and as colorful as our match lists are.
- AutoCluster charts may contain any number of clusters.
- By adjusting the parameters of your AutoCluster run, you can somewhat influence the number of clusters formed, but the underlying algorithm has the final say.
- There are no magic settings to force a chart to contain a certain number of clusters. (If you find them, please share!)
- The chart has ten colors available to it. Therefore, colors will be repeated in charts containing more than ten clusters.
- Each colored block or cluster represents a group of people who match you and most of the others in the cluster.
- Each cluster likely represents a branch or line of your family but no degree or characteristics of any relationship are inferred by the cluster’s existence, color, size, or placement.
- There may be more than one cluster that represents any branch.
- There may be branches that are not represented in the chart at all.
- Each of the colored cells represents an intersection between two of your DNA matches, meaning, they both match you and each other.
- A cluster may contain one or more generations.
- The members of a cluster may share DNA, but do not necessarily share the same segment.
- The members of a cluster are likely related to one another in some way, but each cluster may represent more than one MRCA.
- This is because a cluster may contain cousins of differing degrees, say a 2C and a 3C. The 2C and you will share a 1GG MRCA couple but the 3C and you will share a 2GG MRCA couple.
- Each person in the chart can only be in one cluster.
- Gray cells may be present that are not part of any color-coded cluster. Each one is an intersection of two cousins who sit in separate color-coded clusters, where at least one of them is too closely related to you to belong to just one cluster. The gray cell indicates that one or both of them also belong in the other cluster.
- Use the clusters to support previous research, as a road map to help with phasing, surname identification, and pedigree triangulation, and to discover unrealized connections and patterns between your matches.
- To join our awesome user group to be up to date on the latest features, visit us on Facebook.
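For readers who think in code, the points above about colored cells, gray cells, and the ten-color palette can be sketched in a few lines of Python. This is my reading of how the chart behaves, not the actual AutoCluster implementation. The names `cluster_of`, `shared_pairs`, and the `color-N` labels are hypothetical.

```python
PALETTE_SIZE = 10  # the chart has ten colors available


def cluster_color(cluster_index):
    """Colors repeat once a chart has more than ten clusters."""
    return cluster_index % PALETTE_SIZE


def cell_state(a, b, cluster_of, shared_pairs):
    """Describe the cell where matches a and b intersect.

    cluster_of maps each match to its single cluster number.
    shared_pairs holds frozensets of two matches who match each other.
    """
    if frozenset((a, b)) not in shared_pairs:
        return "empty"  # a and b do not match each other
    if cluster_of[a] == cluster_of[b]:
        return f"color-{cluster_color(cluster_of[a])}"
    # They match each other but sit in different clusters.
    return "gray"


# Invented example: Ann and Bob share a cluster, Cy is in another.
cluster_of = {"Ann": 0, "Bob": 0, "Cy": 1}
shared_pairs = {frozenset(("Ann", "Bob")), frozenset(("Bob", "Cy"))}
```

With this data, the Ann/Bob cell gets the first cluster’s color, the Bob/Cy cell is gray because the two match each other across clusters, and the Ann/Cy cell stays empty. An eleventh cluster (index 10) would reuse the first color.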
We’d love to see what you do with AutoCluster. Post your chart and discoveries on your favorite genealogy Facebook page, on our users group page, or blog about it.
Version 3.0 – January 1, 2019 – Teresa Kahle