Contents

  1. Abstract
  2. The Problem with Personality Questionnaires
  3. Mirror's Approach: Behavioral Signals
  4. The Big Five (OCEAN) Model
  5. Feature Extraction Pipeline
  6. Mapping Features to Traits
  7. Confidence & Maturity Model
  8. Early Signals System
  9. Supported Personality Frameworks
  10. Data Sources & Connectivity
  11. Privacy Architecture
  12. Security & Data Protection
  13. Limitations & Ethical Considerations
  14. References

1. Abstract

MVAT Mirror is a mobile application that builds personality profiles from communication patterns without requiring users to answer a single questionnaire item. By analyzing structural properties of writing, such as sentence length, question frequency, vocabulary diversity, response timing, and other statistical features, Mirror produces Big Five (OCEAN) personality scores with quantified confidence intervals.

All analysis runs on-device. Raw message content is processed in-memory, converted into numerical feature vectors, and immediately discarded. Only derived personality scores are stored or synced. This architecture makes it structurally impossible for Mirror to store or transmit the content of user communications.

This paper describes the scientific foundations underpinning Mirror's analysis, the feature extraction pipeline, the confidence scoring model, the privacy architecture, and the supported personality frameworks.

2. The Problem with Personality Questionnaires

Traditional personality assessment relies on self-report questionnaires — instruments like the NEO-PI-R, the BFI-44, or the MBTI Form M. While these instruments have significant empirical backing, they share structural limitations that reduce their practical utility for most people:

Computer-based behavioral analysis addresses each of these limitations. Writing style is difficult to consciously manipulate, is measured against absolute scales rather than subjective reference groups, can be aggregated across hundreds of samples to reduce noise, imposes zero burden on the user, and naturally tracks longitudinal patterns.

Key finding: Youyou, Kosinski, and Stillwell (2015) demonstrated that computer models predicted personality from digital footprints (Facebook Likes) more accurately than most human judges. With 300+ behavioral signals, the computer model achieved r = 0.56 against self-reports, surpassing the accuracy of coworkers (r = 0.27), friends (r = 0.45), and family members (r = 0.49). Only spouses performed comparably (r = 0.58).

3. Mirror's Approach: Behavioral Signals

Mirror takes a fundamentally different approach to personality assessment. Rather than asking users to describe themselves, Mirror observes how they naturally communicate — and derives personality indicators from the structural patterns in that communication.

Communication sources → In-memory text processing → Feature vector extraction → Trait score computation → Personality profile

The key distinction is between content and style. Mirror never interprets what a message says — it measures how the message is structured. Consider two messages:

Mirror does not evaluate the topic being discussed. Instead, it extracts structural features: message 1 has 32 words, 3 questions, 2 hedging markers ("maybe", "I'm not sure"), high first-person pronoun usage, and a collaborative closing. Message 2 has 3 words, 0 questions, 0 hedging markers, and a directive structure. These structural differences correlate with measurable personality trait differences in Agreeableness, Openness, and Extraversion.

Why Style Beats Content

The insight that function words (pronouns, prepositions, articles) reveal more about personality than content words (nouns, verbs) was established by James Pennebaker's pioneering work at the University of Texas at Austin. His Linguistic Inquiry and Word Count (LIWC) framework demonstrated that the small, easily overlooked words people use ("I" vs. "we", "but" vs. "and", "maybe" vs. "definitely") are more predictive of personality traits than topic choice or vocabulary sophistication (Pennebaker, 2011).

Mirror builds on this research with a feature extraction approach inspired by LIWC's lexicon-based methodology, adapted for mobile communication patterns including email, SMS, and messaging metadata.

4. The Big Five (OCEAN) Model

Mirror's primary personality framework is the Big Five model, the most empirically validated framework in personality psychology. Unlike categorical systems (e.g., MBTI's 16 types), the Big Five describes personality as positions on five continuous spectra. This continuous measurement allows for nuanced, reproducible assessment and enables statistical confidence intervals.

Openness to Experience: Curiosity, imagination, aesthetic sensitivity
Correlated signals: Vocabulary diversity, question frequency, use of abstract language, varied sentence structures, exploration of tangential topics

Conscientiousness: Organization, dependability, self-discipline
Correlated signals: Response timing consistency, message completeness, punctuation correctness, structured formatting, follow-through in threaded conversations

Extraversion: Sociability, assertiveness, positive affect
Correlated signals: Message length and volume, exclamation frequency, emoji usage, positive emotion words, conversation initiation rate

Agreeableness: Warmth, cooperation, empathy
Correlated signals: Social and cooperative words, question-to-statement ratio, hedging language, first-person plural pronouns ("we"), collaborative phrasing

Emotional Stability: Resilience, composure, stress management
Correlated signals: Variance in message timing, negative emotion word frequency, sentiment consistency across messages, cognitive complexity markers

Display convention: Mirror displays the fifth dimension as "Emotional Stability" rather than "Neuroticism." The underlying score is computed as neuroticism and inverted for display, so a high Emotional Stability score corresponds to low neuroticism. This reframing follows positive psychology conventions and avoids pathologizing language.

Each trait is scored on a continuous scale from 0 to 100, where 50 represents the population average. Scores are accompanied by confidence intervals that narrow as more communication data is analyzed. A score of 72 in Openness with a confidence interval of [65, 79] means Mirror estimates the user's Openness is above average, and is 95% confident the true value falls within that range.
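The display inversion and the interval semantics described above can be illustrated with a minimal sketch. It assumes the inversion is a reflection on the 0 to 100 scale (100 minus neuroticism), which keeps the population average at 50. The function name and the exact formula are illustrative assumptions, not confirmed details of Mirror's implementation.

```python
from typing import Tuple

Interval = Tuple[float, float]


def to_emotional_stability(neuroticism: float, interval: Interval) -> Tuple[float, Interval]:
    """Reflect a neuroticism score and its interval for display.

    Assumption: inversion is 100 - score on the 0-100 scale.
    The interval bounds swap roles, because reflecting the scale
    turns the old upper bound into the new lower bound.
    """
    low, high = interval
    return 100.0 - neuroticism, (100.0 - high, 100.0 - low)


# A neuroticism estimate of 28 with interval [21, 35] is displayed as
# Emotional Stability 72 with interval [65, 79].
score, ci = to_emotional_stability(28.0, (21.0, 35.0))
```

Note that the interval width is unchanged by the reflection, so the displayed confidence is the same as for the underlying neuroticism estimate.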

5. Feature Extraction Pipeline

Mirror's analysis pipeline extracts 12 numerical features from each communication event. These features are the only output of raw text processing — once extracted, the source text is discarded from memory. The feature vector is the atomic unit of Mirror's analysis.

| Feature | Type | Description |
|---|---|---|
| wordCount | Integer | Total number of words in the message |
| questionCount | Integer | Number of question marks or interrogative constructions |
| exclamationCount | Integer | Number of exclamation marks |
| firstPersonPronounRatio | Float [0, 1] | Proportion of words that are first-person pronouns (I, me, my, mine) |
| positiveEmotionWords | Integer | Count of words from positive emotion lexicon (joy, love, great, excited) |
| negativeEmotionWords | Integer | Count of words from negative emotion lexicon (hate, angry, sad, worried) |
| socialWords | Integer | Count of social and cooperative terms (we, together, team, help) |
| cognitiveComplexityWords | Integer | Count of causal and reasoning markers (because, therefore, however, although) |
| avgSentenceLength | Float | Mean number of words per sentence |
| vocabularyDiversity | Float [0, 1] | Type-token ratio: unique words divided by total words |
| emojiCount | Integer | Number of emoji characters used |
| isReply | Boolean | Whether the message is a response to another message (contextual estimate) |

Lexicon-Based Feature Detection

Four of the twelve features — positiveEmotionWords, negativeEmotionWords, socialWords, and cognitiveComplexityWords — rely on curated word lists (lexicons). Mirror's approach is inspired by the LIWC (Linguistic Inquiry and Word Count) framework developed by Pennebaker and colleagues (Tausczik & Pennebaker, 2010). Each lexicon category contains carefully selected marker words and their common variations.

The lexicon approach has a critical advantage for privacy: it requires only word-level matching against predefined lists, not sentence-level comprehension. Mirror counts how many words from each category appear — it does not parse grammar, resolve references, or interpret meaning. The lexicons are static assets bundled with the application; no network lookup is required.
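To make the word-level matching concrete, here is a minimal sketch of how the twelve features could be computed from one message. The lexicons contain only the example words listed in the feature table, the tokenizer and emoji pattern are deliberately crude, and the snake_case field names are Python renderings of the feature names above. All of these are assumptions for illustration rather than Mirror's actual implementation.

```python
import re
from dataclasses import dataclass

# Placeholder lexicons seeded from the examples in the feature table.
POSITIVE = {"joy", "love", "great", "excited"}
NEGATIVE = {"hate", "angry", "sad", "worried"}
SOCIAL = {"we", "together", "team", "help"}
COGNITIVE = {"because", "therefore", "however", "although"}
FIRST_PERSON = {"i", "me", "my", "mine"}

WORD_RE = re.compile(r"[a-z']+")
# Rough match over common emoji code point blocks (not exhaustive).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


@dataclass(frozen=True)
class FeatureVector:
    word_count: int
    question_count: int
    exclamation_count: int
    first_person_pronoun_ratio: float
    positive_emotion_words: int
    negative_emotion_words: int
    social_words: int
    cognitive_complexity_words: int
    avg_sentence_length: float
    vocabulary_diversity: float
    emoji_count: int
    is_reply: bool


def extract_features(text: str, is_reply: bool = False) -> FeatureVector:
    words = WORD_RE.findall(text.lower())
    n = len(words)
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return FeatureVector(
        word_count=n,
        question_count=text.count("?"),
        exclamation_count=text.count("!"),
        first_person_pronoun_ratio=sum(w in FIRST_PERSON for w in words) / n if n else 0.0,
        positive_emotion_words=sum(w in POSITIVE for w in words),
        negative_emotion_words=sum(w in NEGATIVE for w in words),
        social_words=sum(w in SOCIAL for w in words),
        cognitive_complexity_words=sum(w in COGNITIVE for w in words),
        avg_sentence_length=n / len(sentences) if sentences else 0.0,
        vocabulary_diversity=len(set(words)) / n if n else 0.0,
        emoji_count=len(EMOJI_RE.findall(text)),
        is_reply=is_reply,
    )


fv = extract_features("We love this team! Can I help because I care?")
```

Every step is a count or a set-membership test on individual tokens, which is the property the lexicon approach relies on: nothing in the function needs to understand what the message is about.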

Feature Aggregation

Individual message features are aggregated into running statistical distributions. Mirror tracks the mean, variance, and trend (slope over time) for each numerical feature across all analyzed messages. This aggregation serves two purposes: it smooths out noise from individual messages, and it enables trend detection — personality expression is not static, and Mirror can observe shifts in communication patterns over time.
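One way to maintain the mean, variance, and trend without retaining individual messages is a streaming update. The sketch below uses Welford's online algorithm for the variance and an online co-moment for the least-squares slope over time. The class name and the choice of a linear trend are assumptions; the source does not specify how Mirror computes the trend.

```python
from dataclasses import dataclass


@dataclass
class RunningFeatureStats:
    """Streaming mean, sample variance, and linear trend of one feature."""
    n: int = 0
    mean: float = 0.0
    _m2: float = 0.0      # sum of squared deviations of the value
    _t_mean: float = 0.0  # running mean of the timestamps
    _t_m2: float = 0.0    # sum of squared deviations of the timestamps
    _cov: float = 0.0     # co-moment of timestamp and value

    def update(self, t: float, x: float) -> None:
        self.n += 1
        dt = t - self._t_mean   # deviation from the OLD time mean
        dx = x - self.mean      # deviation from the OLD value mean
        self._t_mean += dt / self.n
        self.mean += dx / self.n
        # Each product pairs an old-mean deviation with a new-mean deviation.
        self._t_m2 += dt * (t - self._t_mean)
        self._m2 += dx * (x - self.mean)
        self._cov += dt * (x - self.mean)

    @property
    def variance(self) -> float:
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0

    @property
    def slope(self) -> float:
        """Least-squares change in the feature per unit of time."""
        return self._cov / self._t_m2 if self._t_m2 > 0 else 0.0


stats = RunningFeatureStats()
for day, avg_sentence_length in [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0)]:
    stats.update(day, avg_sentence_length)
```

Only six numbers per feature are kept, regardless of how many messages have been processed, which fits the requirement that per-message data does not accumulate.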

6. Mapping Features to Traits

Mirror maps extracted features to Big Five trait scores using weighted scoring functions derived from published correlation matrices. Each trait draws signal from multiple features, and each feature may contribute to multiple traits. The mapping reflects empirical findings from computational linguistics research.

Feature-Trait Correlation Matrix

The following summarizes the primary directional relationships between features and Big Five dimensions, based on research by Yarkoni (2010), Schwartz et al. (2013), and Pennebaker (2011):

Scoring Mechanism

For each Big Five dimension, Mirror computes a weighted sum of normalized feature values. Features are normalized against population baselines derived from published research on email and messaging corpora. The weighted sum is passed through a sigmoid function to produce a score between 0 and 100, centered at 50 (the population mean).

Weights are not learned from Mirror user data — they are derived from published correlation coefficients in the personality-language literature. This design choice means Mirror does not require a training dataset of users who have taken both questionnaires and had their writing analyzed. It also means Mirror never builds behavioral models from user data that could be subject to data mining concerns.
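The scoring mechanism can be sketched as follows. The baseline means, standard deviations, and weights below are invented placeholders chosen only to make the example run; the real values would come from the published correlation coefficients the text refers to. The z-score normalization and the clamp before the exponential are also assumptions.

```python
import math
from typing import Mapping, Tuple

# Hypothetical population baselines: feature -> (mean, standard deviation).
BASELINES = {
    "questionCount": (0.8, 1.0),
    "vocabularyDiversity": (0.70, 0.10),
    "cognitiveComplexityWords": (1.5, 1.5),
}

# Hypothetical weights for one trait (Openness).
OPENNESS_WEIGHTS = {
    "questionCount": 0.3,
    "vocabularyDiversity": 0.5,
    "cognitiveComplexityWords": 0.2,
}


def trait_score(
    features: Mapping[str, float],
    weights: Mapping[str, float],
    baselines: Mapping[str, Tuple[float, float]],
) -> float:
    """Weighted sum of z-normalized features, squashed to (0, 100)."""
    total = 0.0
    for name, weight in weights.items():
        mu, sigma = baselines[name]
        total += weight * (features[name] - mu) / sigma
    # Clamp to keep math.exp within floating-point range.
    total = max(-50.0, min(50.0, total))
    return 100.0 / (1.0 + math.exp(-total))


at_baseline = {"questionCount": 0.8, "vocabularyDiversity": 0.70, "cognitiveComplexityWords": 1.5}
above_baseline = {"questionCount": 2.0, "vocabularyDiversity": 0.85, "cognitiveComplexityWords": 3.0}

score_avg = trait_score(at_baseline, OPENNESS_WEIGHTS, BASELINES)
score_high = trait_score(above_baseline, OPENNESS_WEIGHTS, BASELINES)
```

Because the sigmoid of zero is one half, a user whose aggregated features sit exactly on the population baselines lands at 50, which matches the centering described above.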

7. Confidence & Maturity Model

Personality estimation accuracy improves with the volume of analyzed communication. Mirror quantifies this through two complementary systems: per-trait confidence intervals and an overall profile maturity indicator.

Per-Trait Confidence

Each Big Five trait score is accompanied by a confidence value between 0.0 and 1.0, representing Mirror's certainty in the estimate. Confidence is computed from the number of events analyzed for that trait and the consistency (low variance) of the underlying feature distributions. As a result, a trait with a high event count but high feature variance can receive lower confidence than a trait with fewer events but highly consistent signals.
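The source does not give the confidence formula, so the following is only one plausible shape for it: a volume term that saturates with event count, multiplied by a consistency term that shrinks as feature variance grows. The function name, the `half_saturation` parameter, and both terms are hypothetical.

```python
def trait_confidence(
    event_count: int,
    mean_feature_variance: float,
    half_saturation: int = 100,  # hypothetical: events needed for volume = 0.5
) -> float:
    """Combine data volume and signal consistency into a value in [0, 1]."""
    if event_count <= 0:
        return 0.0
    # Volume term: approaches 1 as more events are analyzed.
    volume = event_count / (event_count + half_saturation)
    # Consistency term: 1 for perfectly stable features, lower as variance rises.
    consistency = 1.0 / (1.0 + max(0.0, mean_feature_variance))
    return volume * consistency


noisy_but_large = trait_confidence(event_count=1000, mean_feature_variance=4.0)
small_but_consistent = trait_confidence(event_count=200, mean_feature_variance=0.1)
```

Under these assumed terms, the smaller but more consistent sample receives the higher confidence, which is the behavior the paragraph above describes.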

Profile Maturity

The overall profile maturity reflects the total number of communication events processed across all connected data sources. This provides users with a simple, intuitive sense of how complete their profile is.

Maturing (0 – 49 events processed): Profile is forming. Early signals may appear.
Building (50 – 499 events processed): Trait estimates emerging. Confidence intervals wide.
Confident (500+ events processed): Reliable profile. Scores stabilized and intervals narrow.

Mirror also provides an estimated time-to-confident calculation based on the user's current data accumulation rate. If a user is generating 15 events per day and has processed 200, Mirror will display "~20 more days of your normal messaging" as a progress indicator.
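The stage thresholds and the time-to-confident arithmetic can be written out directly. The thresholds come from the maturity stages above; the function names and the choice to round up to whole days are assumptions.

```python
import math
from typing import Optional

BUILDING_THRESHOLD = 50
CONFIDENT_THRESHOLD = 500


def maturity_stage(events_processed: int) -> str:
    if events_processed >= CONFIDENT_THRESHOLD:
        return "Confident"
    if events_processed >= BUILDING_THRESHOLD:
        return "Building"
    return "Maturing"


def days_to_confident(events_processed: int, events_per_day: float) -> Optional[int]:
    """Whole days until the confident threshold, or None if no data is arriving."""
    remaining = max(0, CONFIDENT_THRESHOLD - events_processed)
    if remaining == 0:
        return 0
    if events_per_day <= 0:
        return None
    return math.ceil(remaining / events_per_day)


# The example from the text: 200 events processed at 15 events per day.
# (500 - 200) / 15 = 20 days.
days = days_to_confident(200, 15)
```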

8. Early Signals System

Before a full personality profile forms, users want meaningful feedback. Mirror's early signals system surfaces notable patterns as soon as 50 communication events have been processed — well before the 500-event threshold for a confident profile.

Early signals are observations about communication patterns that hint at personality traits without committing to a full score. Each early signal includes:

Early signals are generated by detecting statistical outliers in feature distributions relative to population baselines. If a user's question frequency is consistently above the 75th percentile after 50 messages, Mirror surfaces this as an early signal for Openness — without assigning a numerical score.

Design principle: Early signals use qualitative language ("more than average", "notably consistent") rather than numerical scores. This prevents users from anchoring on premature estimates that may shift significantly as more data arrives.
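A minimal sketch of the early-signal check, assuming "consistently above the 75th percentile" is approximated by the user's mean exceeding a baseline 75th-percentile value. The baseline number, the function signature, and the message wording are all hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence

MIN_EVENTS = 50  # early signals start once 50 events have been processed


def early_signal(
    values: Sequence[float],
    baseline_p75: float,  # hypothetical population 75th percentile for this feature
    behavior: str,
    trait: str,
) -> Optional[str]:
    """Return a qualitative observation, or None if nothing notable is found."""
    if len(values) < MIN_EVENTS:
        return None
    if mean(values) <= baseline_p75:
        return None
    # Qualitative wording only: no numeric score is exposed to the user.
    return f"You {behavior} more than average, a pattern often associated with {trait}."


question_counts = [2.0] * 50
signal = early_signal(question_counts, baseline_p75=1.0,
                      behavior="ask questions", trait="Openness")
```

The returned string carries no number, which follows the design principle that early signals should not give users a premature estimate to anchor on.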

9. Supported Personality Frameworks

While the Big Five model serves as Mirror's foundation, the app supports seven personality frameworks in total: the Big Five is free, and six additional frameworks are available to Pro subscribers. All derived frameworks are computed from the Big Five scores using published mapping functions, so they do not require separate data analysis.

Big Five (OCEAN) [FREE]: Core personality dimensions on continuous scales. The foundation for all other framework mappings.
Enneagram [PRO]: Nine interconnected personality types revealing core motivations and growth paths.
MBTI Analysis [PRO]: 16 cognitive preference types mapped from Big Five scores without the unreliable questionnaire.
DISC Assessment [PRO]: Workplace behavior and communication style profiling: Dominance, Influence, Steadiness, Conscientiousness.
Attachment Style [PRO]: Relationship patterns and emotional bonding style: secure, anxious, avoidant, disorganized.
Love Languages [PRO]: How you express and receive affection: words, acts, gifts, time, touch.
Values in Action [PRO]: Character strengths and virtues classification based on positive psychology research.

Cross-Framework Derivation

The Big Five model serves as a universal foundation because it has well-documented statistical relationships with most other personality frameworks. Research has established reliable mapping functions between Big Five scores and MBTI preferences (McCrae & Costa, 1989), Enneagram types (Dris & Noftle, 2011), DISC dimensions, and attachment style classifications. Mirror uses these published mapping functions to derive secondary framework scores from the primary Big Five analysis, rather than performing separate language analysis for each framework.
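As an illustration of deriving a secondary framework, the sketch below maps Big Five scores to an MBTI-style four-letter code. It relies on the correspondences reported by McCrae & Costa (1989): Extraversion with E/I, Openness with N/S, Agreeableness with F/T, and Conscientiousness with J/P. Cutting each scale at the population midpoint is a simplifying assumption; the source does not state the actual mapping functions Mirror uses, and the dictionary keys are hypothetical.

```python
from typing import Mapping


def mbti_from_big_five(scores: Mapping[str, float], midpoint: float = 50.0) -> str:
    """Dichotomize four Big Five scores into an MBTI-style code.

    Emotional Stability is not used: the MBTI has no corresponding preference.
    """
    return "".join([
        "E" if scores["extraversion"] >= midpoint else "I",
        "N" if scores["openness"] >= midpoint else "S",
        "F" if scores["agreeableness"] >= midpoint else "T",
        "J" if scores["conscientiousness"] >= midpoint else "P",
    ])


code = mbti_from_big_five({
    "extraversion": 70,
    "openness": 30,
    "agreeableness": 60,
    "conscientiousness": 80,
})
```

A hard cut at the midpoint discards the continuous information the Big Five provides, so scores close to 50 would flip letters easily; a production mapping would likely need to account for that.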

10. Data Sources & Connectivity

Mirror connects to three categories of communication data. Each source undergoes the same feature extraction pipeline. Free accounts can connect up to 3 sources; Pro accounts have no limit.

Gmail (OAuth 2.0)

Mirror requests read access to sent mail only — it does not access received emails, drafts, or other mailbox contents. Each sent email is processed in-memory to extract a feature vector. The raw email body is never written to disk, stored in a database, or transmitted to any server. OAuth tokens are stored securely on-device and can be revoked at any time through Mirror's settings.

SMS / Messages

On Android, Mirror accesses SMS content with user permission to extract writing pattern features. On iOS, platform privacy restrictions prevent Mirror from accessing iMessage or SMS content directly, so it falls back to contact frequency and timing metadata only. This fallback provides Conscientiousness and Extraversion signals but limited Openness and Agreeableness data.

Call Logs

Mirror analyzes call duration and frequency patterns — it does not record or analyze call audio. Call log data contributes primarily to Extraversion signals (call frequency, average duration, breadth of contacts) and Conscientiousness signals (consistency of calling patterns, time-of-day regularity).
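The call-log signals named above can be computed from metadata alone. This is a minimal sketch with a hypothetical record type and hypothetical output keys; the standard deviation of start hours is used as a simple stand-in for time-of-day regularity and ignores wraparound at midnight.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, Sequence


@dataclass(frozen=True)
class CallRecord:
    contact_id: str    # opaque identifier, not a name or phone number
    start_hour: float  # local hour of day, 0 to 24
    duration_s: float  # call length in seconds


def call_log_features(calls: Sequence[CallRecord], days_observed: int) -> Dict[str, float]:
    if not calls or days_observed <= 0:
        return {"callsPerDay": 0.0, "avgDurationSeconds": 0.0,
                "distinctContacts": 0.0, "startHourStdDev": 0.0}
    return {
        # Extraversion-related: frequency, duration, breadth of contacts.
        "callsPerDay": len(calls) / days_observed,
        "avgDurationSeconds": mean(c.duration_s for c in calls),
        "distinctContacts": float(len({c.contact_id for c in calls})),
        # Conscientiousness-related: lower spread means more regular timing.
        "startHourStdDev": pstdev(c.start_hour for c in calls),
    }


calls = [
    CallRecord("a", 9.0, 60.0),
    CallRecord("b", 9.0, 120.0),
    CallRecord("a", 9.0, 180.0),
]
features = call_log_features(calls, days_observed=3)
```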

Historical Import

Users can optionally import historical communications to accelerate profile maturity. Historical import requires explicit opt-in and processes messages in batches with a progress indicator. This is the fastest path to a confident profile — a user with 500+ sent emails in their Gmail account can reach confident profile status immediately after the initial import completes.

11. Privacy Architecture

Privacy in Mirror is not a policy decision — it is a structural property of the system architecture. The service interfaces are designed so that raw text physically cannot be transmitted beyond the feature extraction boundary.

The Privacy Boundary

Mirror's code architecture enforces a strict boundary between text processing and personality analysis. The data ingestion service processes raw text and outputs only feature vectors (the 12 numerical features described in Section 5). The personality analysis service accepts only feature vectors as input — it has no API for receiving raw text. This structural API design means that even a code bug or misconfiguration cannot cause raw text to reach the cloud sync layer, because the types do not permit it.

  1. Raw text enters memory. Communication content is loaded into device memory for processing. It is never written to disk.
  2. Feature extraction. 12 numerical features are computed from the text. Word lists are matched against static lexicons. No semantic analysis occurs.
  3. Text discarded. The raw text is released from memory. Only the numerical feature vector (a few hundred bytes) remains.
  4. Trait scoring. Feature vectors are aggregated and mapped to Big Five scores with confidence intervals. All computation is on-device.
  5. Personality vectors stored. Only the final personality scores, confidence values, and event counts are persisted, locally and optionally to cloud backup.
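The boundary between ingestion and analysis can be sketched as a pair of interfaces in which the only value that crosses is a numeric record. The class and function names are hypothetical, and the vector is abbreviated to four fields. Python type hints are not enforced at runtime, so this sketch adds explicit checks; in a statically typed language the compiler would carry that guarantee.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class FeatureVector:
    """Abbreviated numeric-only record; no field can hold text."""
    word_count: int
    question_count: int
    vocabulary_diversity: float
    is_reply: bool

    def __post_init__(self) -> None:
        for f in fields(self):
            if not isinstance(getattr(self, f.name), (bool, int, float)):
                raise TypeError(f"{f.name} must be numeric or boolean")


def ingest(raw_text: str, is_reply: bool) -> FeatureVector:
    """Ingestion side: the only place raw text is visible."""
    words = raw_text.lower().split()
    diversity = len(set(words)) / len(words) if words else 0.0
    # raw_text and words go out of scope when this function returns.
    return FeatureVector(len(words), raw_text.count("?"), diversity, is_reply)


class PersonalityAnalyzer:
    """Analysis side: accepts feature vectors and nothing else."""

    def __init__(self) -> None:
        self._vectors: List[FeatureVector] = []

    def add(self, vector: FeatureVector) -> None:
        if not isinstance(vector, FeatureVector):
            raise TypeError("analyzer accepts FeatureVector only")
        self._vectors.append(vector)

    @property
    def event_count(self) -> int:
        return len(self._vectors)


analyzer = PersonalityAnalyzer()
analyzer.add(ingest("Are we there yet?", is_reply=True))
```

Passing a string to `analyzer.add` raises `TypeError`, and so does constructing a `FeatureVector` with a string in any field, so text has no slot in which to travel past the ingestion function.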

What Is Never Stored or Transmitted

What Is Stored

Data Export and Deletion

Users can export their complete data at any time through Mirror's privacy controls. The export is a JSON file containing only personality vectors and metadata — it explicitly declares that no raw communications are included. Account deletion triggers a full purge of all data from both the device and cloud storage, completing within 30 seconds.

12. Security & Data Protection

Authentication

Mirror supports sign-in via Google OAuth and Apple Sign-In. Authentication is handled through Firebase Authentication, with identity tokens stored securely in the device keychain (iOS) or encrypted shared preferences (Android). Mirror does not implement custom password authentication.

Data at Rest

On-device personality vectors and feature distributions are stored in encrypted local storage. Cloud-synced data is stored in Firestore with Firebase security rules that restrict read and write access to the authenticated user's own documents.

Data in Transit

All network communication uses TLS 1.2 or higher. OAuth token exchanges follow the Authorization Code flow with PKCE (Proof Key for Code Exchange), the industry standard for mobile applications. The critical point is that very little data is in transit — the on-device architecture minimizes network exposure by design.

OAuth Token Management

Gmail OAuth tokens are stored on-device and used only for fetching sent mail during active sync operations. Tokens are never transmitted to Mirror servers. When a user disconnects a data source, access tokens are revoked and deleted from the device. Derived personality vectors from previously analyzed communications are retained unless the user explicitly requests deletion.

Third-Party Data Sharing

Mirror does not share personality data, behavioral data, usage data, or any other user data with third parties, advertisers, data brokers, or analytics providers. The only external services Mirror communicates with are Firebase (for authentication and optional cloud backup) and the respective OAuth providers during sign-in.

13. Limitations & Ethical Considerations

Mirror provides personality estimates, not clinical assessments. Users and any downstream consumers of Mirror data should understand the following limitations:

Accuracy Limitations

Ethical Principles

14. References

  1. Pennebaker, J.W. (2011). The Secret Life of Pronouns: What Our Words Say About Us. Bloomsbury Press.
  2. Tausczik, Y.R. & Pennebaker, J.W. (2010). The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. Journal of Language and Social Psychology, 29(1), 24-54.
  3. Yarkoni, T. (2010). Personality in 100,000 Words: A Large-Scale Analysis of Personality and Word Use among Bloggers. Journal of Research in Personality, 44(3), 363-373.
  4. Schwartz, H.A., Eichstaedt, J.C., Kern, M.L., et al. (2013). Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach. PLoS ONE, 8(9), e73791.
  5. Youyou, W., Kosinski, M., & Stillwell, D. (2015). Computer-Based Personality Judgments Are More Accurate Than Those Made by Humans. Proceedings of the National Academy of Sciences, 112(4), 1036-1040.
  6. Stachl, C., Au, Q., Schoedel, R., et al. (2020). Predicting Personality from Patterns of Behavior Collected with Smartphones. Proceedings of the National Academy of Sciences, 117(30), 17680-17687.
  7. Golbeck, J., Robles, C., Edmondson, M., & Turner, K. (2011). Predicting Personality from Twitter. IEEE International Conference on Social Computing, 149-156.
  8. Gosling, S.D., Rentfrow, P.J., & Swann, W.B. Jr. (2003). A Very Brief Measure of the Big-Five Personality Domains. Journal of Research in Personality, 37(6), 504-528.
  9. Paulhus, D.L. (1991). Measurement and Control of Response Bias. In J.P. Robinson, P.R. Shaver, & L.S. Wrightsman (Eds.), Measures of Personality and Social Psychological Attitudes (pp. 17-59). Academic Press.
  10. McCrae, R.R. & Costa, P.T. Jr. (1989). Reinterpreting the Myers-Briggs Type Indicator from the Perspective of the Five-Factor Model of Personality. Journal of Personality, 57(1), 17-40.
  11. John, O.P., Naumann, L.P., & Soto, C.J. (2008). Paradigm Shift to the Integrative Big Five Trait Taxonomy. In O.P. John, R.W. Robins, & L.A. Pervin (Eds.), Handbook of Personality: Theory and Research (3rd ed., pp. 114-158). Guilford Press.
  12. Park, G., Schwartz, H.A., Eichstaedt, J.C., et al. (2015). Automatic Personality Assessment through Social Media Language. Journal of Personality and Social Psychology, 108(6), 934-952.