Summary for Non-Technical Readers
We wanted to create a fun and efficient way for users to discover their "type" by swiping through AI-generated profiles of people. This article walks you through the behind-the-scenes process of how we built the system. Here’s the simplified version:
- Profile Creation: We chose a few key appearance characteristics and used specific AI models to create profiles based on those traits.
- Data Tracking: As you swipe, we record which profiles you like or dislike, how long you take to decide, and any patterns in your preferences (like consistently liking a particular hair color).
- Algorithm Ranking: The algorithm ranks characteristics like hair color or style based on how often and how quickly you positively evaluate them.
- Refining Your Preferences: Over time, the system starts showing you more profiles that match your preferred characteristics, while occasionally mixing in random profiles to ensure variety.
Now, for the tech enthusiasts, here’s a deeper dive into the development process and how it all works!
Getting the Data: Generating Profile Prompts
We started by defining the key characteristics we wanted to include in our dataset. To create the wide variety of profiles displayed on YourType, we generate a detailed prompt for each one. These prompts include essential features like:
- Hair color: blonde, brunette, black, red, etc.
- Hairstyle: curly, straight, long, short, etc.
- Other attributes: eye color, ethnicity, facial hair, accessories, etc.
The prompt generation tool takes these characteristics as input and produces the text prompt from which each profile image is generated. Once we had settled on the attributes to include (hair color, hairstyle, and so on), we structured the prompt generation process so that every prompt covers them consistently.
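To make the idea concrete, here is a minimal sketch of how a prompt could be assembled from a fixed set of attributes. The attribute pools, the function names, and the prompt wording are all assumptions made for illustration; the actual tool and its prompt template are not shown in this article.

```python
import random

# Hypothetical attribute pools; the values mirror the examples listed above.
ATTRIBUTES = {
    "hair": ["blonde", "brunette", "black", "red"],
    "hairstyle": ["curly", "straight", "long", "short"],
    "eyes": ["blue", "brown", "green"],
}


def sample_traits(rng):
    """Pick one value for every attribute so no prompt is missing a trait."""
    return {name: rng.choice(values) for name, values in ATTRIBUTES.items()}


def build_prompt(traits):
    """Render the traits in a fixed order so all prompts share one structure."""
    described = ", ".join(f"{value} {name}" for name, value in traits.items())
    return f"Portrait photo of a person with {described}"


rng = random.Random(0)  # seeded only to make the example repeatable
traits = sample_traits(rng)
print(build_prompt(traits))
```

Keeping the sampled traits as a dictionary alongside the prompt is convenient, because the same dictionary can later serve as the profile's metadata.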
Generating Profiles: Finding the Right Model
Creating realistic, appealing profile images was one of the critical steps. We used a model capable of generating detailed visual representations based on the attribute prompts. The challenge was optimizing the prompts to ensure that the generated images were visually distinct and reflective of the specific characteristics we wanted to highlight.
For example, finding the right balance between hair color, style, and overall look was key to making sure the profiles were representative and varied. After some experimentation, we adjusted the model’s input prompts until the results were both varied and appealing.
Generating Metadata: Tracking and Analyzing User Interactions
Once the profiles were generated, we needed a way to track how users interact with them. For every profile shown to a user, the system collects metadata. This includes information about which characteristics were present in the profile and how the user responded to it.
Here’s how we handle this:
- Storing Characteristics: We extract the visual attributes from each profile (e.g., hair color = blonde, hairstyle = straight) and save them in a dictionary format. This dictionary is then stored in a .txt file, making it easy to retrieve and analyze.
- User Analysis Tools: We use this metadata to evaluate user preferences. For example, if a user consistently swipes positively on profiles with blonde hair, the system starts ranking "blonde hair" higher in their preference list. Over time, this helps the algorithm figure out which traits the user is most attracted to.
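As an illustration of the storage step, the sketch below writes each profile's attribute dictionary to a .txt file and reads it back. The one-JSON-object-per-line layout, the `profile_id` field, and both function names are assumptions; the article only states that a dictionary is stored in a .txt file.

```python
import json
from pathlib import Path


def append_profile_metadata(path, profile_id, traits):
    """Append one profile's attribute dictionary as a single JSON line."""
    record = {"profile_id": profile_id, "traits": traits}
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def load_profile_metadata(path):
    """Read the file back into a {profile_id: traits} dictionary."""
    records = {}
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():  # skip blank lines
                record = json.loads(line)
                records[record["profile_id"]] = record["traits"]
    return records
```

A call such as `append_profile_metadata("profiles.txt", "p1", {"hair color": "blonde", "hairstyle": "straight"})` adds one line to the file, and `load_profile_metadata("profiles.txt")` returns the attributes keyed by profile.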
The Algorithm: Ranking Characteristics
Managing Attributes in Arrays
Each profile characteristic (such as "Hair Color" or "Hairstyle") is managed in its own array. For instance, the "Hair Color" array might look something like this:
HairColor = ["blonde", "brown", "black", "red"]
Each characteristic has its own array, and we track how often each value is chosen or evaluated positively by the user.
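One simple way to track how often each value is evaluated positively is a pair of counters per value. The sketch below is an assumption about how this could look; the counter names and the like-rate calculation are illustrative and not taken from the production code.

```python
HAIR_COLOR = ["blonde", "brown", "black", "red"]


def new_counts(values):
    """Start every value with zero impressions and zero likes."""
    return {value: {"shown": 0, "liked": 0} for value in values}


def record_swipe(counts, value, liked):
    """Count one impression, and one like if the user swiped right."""
    counts[value]["shown"] += 1
    if liked:
        counts[value]["liked"] += 1


def like_rate(counts, value):
    """Share of impressions that were liked; 0.0 if the value was never shown."""
    shown = counts[value]["shown"]
    return counts[value]["liked"] / shown if shown else 0.0


counts = new_counts(HAIR_COLOR)
record_swipe(counts, "blonde", liked=True)
record_swipe(counts, "blonde", liked=False)
print(like_rate(counts, "blonde"))  # 0.5
```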
Creating Key Elements for Easy Indexing
Each characteristic in the array is converted into a key element, which helps the algorithm index and rank the characteristics more efficiently. The key elements contain the following information:
- Attribute name: e.g., "HairColor"
- Current index: This shows the current ranking of the characteristic based on user interactions (e.g., how often a user swipes right on blonde hair).
- Streak counter: This counts how many times in a row a particular attribute (like blonde hair) was evaluated positively. This is important because consistent positive evaluations indicate a strong user preference.
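A key element with these three fields could be modeled roughly as follows. This is a sketch under assumptions: the class name, the extra `value` field, and the rule that a negative swipe resets the streak to zero are ours (the reset follows from counting evaluations "in a row", but the article does not spell it out).

```python
from dataclasses import dataclass


@dataclass
class KeyElement:
    attribute: str   # attribute name, e.g. "HairColor"
    value: str       # concrete value, e.g. "blonde" (assumed extra field)
    index: int = 0   # current rank position, where 0 is the top
    streak: int = 0  # consecutive positive evaluations

    def register(self, liked: bool) -> None:
        """Extend the streak on a like, reset it on a dislike."""
        self.streak = self.streak + 1 if liked else 0


blonde = KeyElement("HairColor", "blonde")
for liked in (True, True, True):
    blonde.register(liked)
print(blonde.streak)  # 3
```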
Ranking the Key Elements
The next step involves ranking these key elements. To do this, the algorithm looks at several factors:
- Decision Speed: The time it takes for a user to swipe left or right on a profile gives the algorithm clues about how confident the user is in their decision. If the user quickly swipes right on profiles with specific characteristics, it’s a strong indicator that they like those traits.
- Streaks of Positive Evaluations: If a user positively evaluates (swipes right) several profiles in a row with the same attribute, like "blonde hair" or "curly hairstyle," the algorithm increases the ranking of that attribute. The longer the streak, the more confident the algorithm becomes in recognizing that as a key preference.
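The two factors can be combined into a single score per attribute value. The sketch below is one possible formulation, not the formula used in production: the linear speed weight, the 5-second cap, and the choice to multiply the weight by the current streak length are all assumptions.

```python
def swipe_weight(decision_seconds, max_seconds=5.0):
    """Map decision time to a weight in [0.5, 1.0]; faster swipes weigh more."""
    clamped = min(max(decision_seconds, 0.0), max_seconds)
    return 1.0 - 0.5 * clamped / max_seconds


def update(scores, streaks, value, liked, decision_seconds):
    """Update the score and streak of one attribute value after a swipe."""
    if liked:
        streaks[value] = streaks.get(value, 0) + 1
        # Longer streaks add more, so consistent likes rise quickly.
        scores[value] = scores.get(value, 0.0) + swipe_weight(decision_seconds) * streaks[value]
    else:
        streaks[value] = 0
        scores.setdefault(value, 0.0)


def ranking(scores):
    """Return values ordered from highest to lowest score (ties by name)."""
    return sorted(scores, key=lambda v: (-scores[v], v))


scores, streaks = {}, {}
update(scores, streaks, "blonde", liked=True, decision_seconds=1.0)
update(scores, streaks, "blonde", liked=True, decision_seconds=1.0)
update(scores, streaks, "red", liked=True, decision_seconds=4.0)
update(scores, streaks, "black", liked=False, decision_seconds=2.0)
print(ranking(scores))  # ['blonde', 'red', 'black']
```

Multiplying by the streak length is a deliberate simplification: it rewards consistency, but it also means a single dislike slows an attribute's climb until a new streak builds up.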
Refining Your Preferences: Combining Evaluation Data
After showing the user a fixed number of profiles, the algorithm makes a decision. It picks a combination of profiles that reflects the user’s highest-ranked characteristics, ensuring that the user sees more of what they like. However, to avoid overfitting (where the algorithm becomes too focused on one set of characteristics), we also include a few random profiles in the mix.
These random profiles serve two purposes:
- Preventing Bias: By introducing some random profiles, we avoid creating a feedback loop where the algorithm only shows the same types of profiles repeatedly.
- Error Checking: Random profiles help the system identify potential errors in pattern recognition. If a user suddenly prefers a random profile with an attribute they haven’t shown interest in before, the algorithm adapts to this new input.
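The mixing step can be sketched as follows. For simplicity this example ranks profiles on a single attribute, and the batch size, the 20% random share, and the function name are assumptions chosen for illustration.

```python
import random


def next_batch(profiles, ranked_values, attribute, batch_size=10,
               random_share=0.2, rng=None):
    """Pick mostly top-ranked profiles plus a few random ones, then shuffle."""
    rng = rng or random.Random()
    n_random = max(1, round(batch_size * random_share))
    n_preferred = batch_size - n_random

    # Lower rank number means more preferred; unknown values go last.
    rank = {value: i for i, value in enumerate(ranked_values)}
    by_preference = sorted(
        profiles, key=lambda p: rank.get(p[attribute], len(rank))
    )

    preferred = by_preference[:n_preferred]
    remaining = by_preference[n_preferred:]
    # Random picks come from everything not already selected.
    randoms = rng.sample(remaining, min(n_random, len(remaining)))

    batch = preferred + randoms
    rng.shuffle(batch)  # hide which profiles were the random ones
    return batch


profiles = [{"id": i, "HairColor": "blonde" if i < 10 else "red"} for i in range(20)]
batch = next_batch(profiles, ["blonde", "red"], "HairColor", rng=random.Random(0))
print(len(batch))  # 10
```

Drawing the random profiles from the whole remaining pool, rather than only from low-ranked ones, keeps the exploration unbiased: some random picks will match known preferences and some will not.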
How the Algorithm Identifies Your Type
Overall, the algorithm is designed to identify liked characteristics by letting them "bubble up" naturally through repeated positive interactions. As users swipe, the system becomes more confident in their preferences by analyzing patterns in both decision speed and streaks of positive evaluations.
By carefully balancing these factors, the algorithm learns what you’re most likely to be attracted to and starts showing you profiles that reflect those preferences, while still maintaining some variety to keep the experience fresh.
However, it’s important to note that this process is not always accurate. Several factors can contribute to inaccuracies in the algorithm's predictions:
- User Behavior Variability: Users may swipe differently based on their mood, context, or even the time of day. For example, someone might swipe right on a profile one day but left on a similar profile the next day due to a shift in preferences or emotional state. This variability can confuse the algorithm and lead to less reliable conclusions about what traits a user truly prefers.
- Limited Sample Size: The algorithm relies on the data it collects from user interactions. If a user has only swiped through a small number of profiles, the algorithm may not have enough information to accurately assess their preferences. A larger and more diverse set of interactions would provide more reliable data for the algorithm to analyze.
- Randomness in Introduced Profiles: While including random profiles is essential for variety and error checking, it can sometimes skew the perceived preferences of users. If a user accidentally swipes right on a random profile that doesn’t align with their usual preferences, that swipe adds noise to the data and can lead the algorithm to misinterpret their tastes.
- Complexity of Preferences: Human attraction and preferences can be complex and multifaceted, often influenced by factors that go beyond the visible attributes the algorithm tracks. Traits like personality, interests, and values, which aren’t always captured in the generated profiles, play significant roles in attraction and compatibility.
- Data Indexing Issues: The characteristics of the profiles are not always reliably indexed, leading to mismatches between the attributes stored in the database and the real appearance of the images shown, which affects the overall reliability of the algorithm.
Conclusion
In summary, while the algorithm is designed to learn and adapt to your preferences over time, it may not always provide perfect recommendations. Users are encouraged to explore profiles with an open mind, as their preferences may evolve and change with new experiences and interactions.