Zero-question personality profiling through on-device communication pattern analysis
MVAT Mirror is a mobile application that builds personality profiles from communication patterns without requiring users to answer a single questionnaire item. By analyzing structural properties of writing, such as sentence length, question frequency, vocabulary diversity, and response timing, Mirror produces Big Five (OCEAN) personality scores with quantified confidence intervals.
All analysis runs on-device. Raw message content is processed in-memory, converted into numerical feature vectors, and immediately discarded. Only derived personality scores are stored or synced. This architecture makes it structurally impossible for Mirror to read, store, or transmit the content of user communications.
This paper describes the scientific foundations underpinning Mirror's analysis, the feature extraction pipeline, the confidence scoring model, the privacy architecture, and the supported personality frameworks.
Traditional personality assessment relies on self-report questionnaires — instruments like the NEO-PI-R, the BFI-44, or the MBTI Form M. While these instruments have significant empirical backing, they share structural limitations that reduce their practical utility for most people:
Computer-based behavioral analysis addresses each of these limitations. Writing style is difficult to consciously manipulate, is measured against absolute scales rather than subjective reference groups, can be aggregated across hundreds of samples to reduce noise, imposes zero burden on the user, and naturally tracks longitudinal patterns.
Key finding: Youyou, Kosinski, and Stillwell (2015) demonstrated that computer models predicted personality from digital behavior (Facebook Likes) more accurately than most human judges. With 300+ behavioral signals, the computer model achieved r = 0.56 against self-reports, surpassing the accuracy of coworkers (r = 0.27), friends (r = 0.45), and family members (r = 0.50). Only spouses performed comparably (r = 0.58).
Mirror takes a fundamentally different approach to personality assessment. Rather than asking users to describe themselves, Mirror observes how they naturally communicate — and derives personality indicators from the structural patterns in that communication.
The key distinction is between content and style. Mirror never interprets what a message says — it measures how the message is structured. Consider two messages:
Mirror does not evaluate the topic being discussed. Instead, it extracts structural features. Message 1 has 32 words, 3 questions, 2 hedging markers ("maybe", "I'm not sure"), high first-person pronoun usage, and a collaborative closing. Message 2 has 3 words, 0 questions, 0 hedging markers, and a directive structure. These structural differences correlate with measurable personality trait differences in Agreeableness, Openness, and Extraversion.
The insight that function words (pronouns, prepositions, articles) reveal more about personality than content words (nouns, verbs) was established by James Pennebaker's pioneering work at the University of Texas at Austin. His Linguistic Inquiry and Word Count (LIWC) framework demonstrated that the small, easily-overlooked words people use — "I" vs. "we", "but" vs. "and", "maybe" vs. "definitely" — are more predictive of personality traits than topic choice or vocabulary sophistication (Pennebaker, 2011).
Mirror builds on this research with a feature extraction approach inspired by LIWC's lexicon-based methodology, adapted for mobile communication patterns including email, SMS, and messaging metadata.
Mirror's primary personality framework is the Big Five model, the most empirically validated framework in personality psychology. Unlike categorical systems (e.g., MBTI's 16 types), the Big Five describes personality as positions on five continuous spectra. This continuous measurement allows for nuanced, reproducible assessment and enables statistical confidence intervals.
Display convention: Mirror displays the fifth dimension as "Emotional Stability" rather than "Neuroticism." The underlying score is computed as neuroticism and inverted for display, so a high Emotional Stability score corresponds to low neuroticism. This reframing follows positive psychology conventions and avoids pathologizing language.
Each trait is scored on a continuous scale from 0 to 100, where 50 represents the population average. Scores are accompanied by confidence intervals that narrow as more communication data is analyzed. A score of 72 in Openness with a confidence interval of [65, 79] means Mirror estimates the user's Openness is above average, and is 95% confident the true value falls within that range.
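The narrowing of the interval with more data can be illustrated with a short sketch. The paper does not state how Mirror computes its intervals, so the normal approximation below (half-width of 1.96 standard errors for 95% coverage) and the function name `confidence_interval` are assumptions chosen for illustration only.

```python
import math


def confidence_interval(mean_score: float, std_dev: float, n_events: int,
                        z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation interval, clamped to the 0-100 trait scale.

    Assumption: per-event score estimates are roughly independent with
    standard deviation `std_dev`; z = 1.96 corresponds to 95% coverage.
    """
    if n_events < 1:
        raise ValueError("need at least one event")
    half_width = z * std_dev / math.sqrt(n_events)
    low = max(0.0, mean_score - half_width)
    high = min(100.0, mean_score + half_width)
    return (low, high)


# Same estimate and spread, different amounts of data.
wide = confidence_interval(72.0, 25.0, 49)     # about (65, 79)
narrow = confidence_interval(72.0, 25.0, 400)  # about (69.55, 74.45)
```

With these hypothetical inputs, 49 events reproduce the [65, 79] interval used in the example above, and roughly eight times as many events shrink the interval to about a third of that width.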
Mirror's analysis pipeline extracts 12 numerical features from each communication event. These features are the only output of raw text processing — once extracted, the source text is discarded from memory. The feature vector is the atomic unit of Mirror's analysis.
| Feature | Type | Description |
|---|---|---|
| wordCount | Integer | Total number of words in the message |
| questionCount | Integer | Number of question marks or interrogative constructions |
| exclamationCount | Integer | Number of exclamation marks |
| firstPersonPronounRatio | Float [0, 1] | Proportion of words that are first-person pronouns (I, me, my, mine) |
| positiveEmotionWords | Integer | Count of words from positive emotion lexicon (joy, love, great, excited) |
| negativeEmotionWords | Integer | Count of words from negative emotion lexicon (hate, angry, sad, worried) |
| socialWords | Integer | Count of social and cooperative terms (we, together, team, help) |
| cognitiveComplexityWords | Integer | Count of causal and reasoning markers (because, therefore, however, although) |
| avgSentenceLength | Float | Mean number of words per sentence |
| vocabularyDiversity | Float [0, 1] | Type-token ratio: unique words divided by total words |
| emojiCount | Integer | Number of emoji characters used |
| isReply | Boolean | Whether the message is a response to another message (contextual estimate) |
Four of the twelve features — positiveEmotionWords, negativeEmotionWords, socialWords, and cognitiveComplexityWords — rely on curated word lists (lexicons). Mirror's approach is inspired by the LIWC (Linguistic Inquiry and Word Count) framework developed by Pennebaker and colleagues (Tausczik & Pennebaker, 2010). Each lexicon category contains carefully selected marker words and their common variations.
The lexicon approach has a critical advantage for privacy: it requires only word-level matching against predefined lists, not sentence-level comprehension. Mirror counts how many words from each category appear — it does not parse grammar, resolve references, or interpret meaning. The lexicons are static assets bundled with the application; no network lookup is required.
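A minimal sketch of this kind of extraction is shown below. It is not Mirror's implementation: the lexicons contain only the example words from the feature table, the field names are snake_case versions of the table's names, and the tokenizer, sentence splitter, and emoji ranges are simplifying assumptions.

```python
import re
from dataclasses import dataclass

# Toy lexicons seeded with the example words from the feature table (assumption:
# real lists would be far larger and include word variations).
POSITIVE = {"joy", "love", "great", "excited"}
NEGATIVE = {"hate", "angry", "sad", "worried"}
SOCIAL = {"we", "together", "team", "help"}
COGNITIVE = {"because", "therefore", "however", "although"}
FIRST_PERSON = {"i", "me", "my", "mine"}

WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")
SENTENCE_SPLIT_RE = re.compile(r"[.!?]+")
# Rough emoji coverage (assumption): pictographs plus misc symbols and dingbats.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


@dataclass(frozen=True)
class FeatureVector:
    word_count: int
    question_count: int
    exclamation_count: int
    first_person_pronoun_ratio: float
    positive_emotion_words: int
    negative_emotion_words: int
    social_words: int
    cognitive_complexity_words: int
    avg_sentence_length: float
    vocabulary_diversity: float
    emoji_count: int
    is_reply: bool


def count_in(words: list[str], lexicon: set[str]) -> int:
    """Word-level matching only: no grammar parsing, no interpretation."""
    return sum(1 for w in words if w in lexicon)


def extract_features(text: str, is_reply: bool = False) -> FeatureVector:
    words = WORD_RE.findall(text.lower())
    sentences = [s for s in SENTENCE_SPLIT_RE.split(text) if s.strip()]
    n = len(words)
    return FeatureVector(
        word_count=n,
        question_count=text.count("?"),
        exclamation_count=text.count("!"),
        first_person_pronoun_ratio=count_in(words, FIRST_PERSON) / n if n else 0.0,
        positive_emotion_words=count_in(words, POSITIVE),
        negative_emotion_words=count_in(words, NEGATIVE),
        social_words=count_in(words, SOCIAL),
        cognitive_complexity_words=count_in(words, COGNITIVE),
        avg_sentence_length=n / len(sentences) if sentences else 0.0,
        vocabulary_diversity=len(set(words)) / n if n else 0.0,
        emoji_count=len(EMOJI_RE.findall(text)),
        is_reply=is_reply,
    )


fv = extract_features("We love this plan because it helps the team! Do you agree?")
```

The function returns only numbers and a boolean; the caller is expected to drop its reference to `text` once the vector has been produced.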
Individual message features are aggregated into running statistical distributions. Mirror tracks the mean, variance, and trend (slope over time) for each numerical feature across all analyzed messages. This aggregation serves two purposes: it smooths out noise from individual messages, and it enables trend detection — personality expression is not static, and Mirror can observe shifts in communication patterns over time.
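One way to maintain these statistics without retaining individual messages is an online update, sketched below. The paper does not say which algorithm Mirror uses; this sketch assumes Welford's method for mean and variance and an ordinary least-squares slope over the event index as the trend, and the class name `RunningStat` is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStat:
    """Online mean, sample variance, and linear trend for one feature."""
    n: int = 0
    mean: float = 0.0
    _m2: float = 0.0       # sum of squared deviations (Welford)
    _sum_t: float = 0.0    # accumulators for the least-squares slope
    _sum_tt: float = 0.0
    _sum_x: float = 0.0
    _sum_tx: float = 0.0

    def update(self, x: float) -> None:
        t = float(self.n)  # event index stands in for time (assumption)
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)
        self._sum_t += t
        self._sum_tt += t * t
        self._sum_x += x
        self._sum_tx += t * x

    @property
    def variance(self) -> float:
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0

    @property
    def trend(self) -> float:
        """Slope of the feature value per event; 0.0 until two events exist."""
        denom = self.n * self._sum_tt - self._sum_t ** 2
        if denom == 0:
            return 0.0
        return (self.n * self._sum_tx - self._sum_t * self._sum_x) / denom


stat = RunningStat()
for words_per_sentence in (10.0, 12.0, 14.0, 16.0):
    stat.update(words_per_sentence)
```

Only a handful of floating-point accumulators persist per feature, which fits the requirement that the source text and individual feature vectors need not be kept.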
Mirror maps extracted features to Big Five trait scores using weighted scoring functions derived from published correlation matrices. Each trait draws signal from multiple features, and each feature may contribute to multiple traits. The mapping reflects empirical findings from computational linguistics research.
The following summarizes the primary directional relationships between features and Big Five dimensions, based on research by Yarkoni (2010), Schwartz et al. (2013), and Pennebaker (2011):
For each Big Five dimension, Mirror computes a weighted sum of normalized feature values. Features are normalized against population baselines derived from published research on email and messaging corpora. The weighted sum is passed through a sigmoid function to produce a score between 0 and 100, centered at 50 (the population mean).
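The computation described above can be sketched as follows. The baseline means, standard deviations, and weights in the example are placeholders invented for illustration; they are not values from the literature or from Mirror, and `trait_score` is a hypothetical name.

```python
import math
from typing import Mapping, Tuple


def trait_score(features: Mapping[str, float],
                baselines: Mapping[str, Tuple[float, float]],
                weights: Mapping[str, float]) -> float:
    """Weighted sum of z-scored features, squashed to 0-100 by a sigmoid.

    baselines maps feature name -> (population mean, population std dev).
    A user exactly at baseline on every feature scores 50.
    """
    total = 0.0
    for name, weight in weights.items():
        mean, std = baselines[name]
        total += weight * (features[name] - mean) / std
    return 100.0 / (1.0 + math.exp(-total))


# Placeholder numbers only (assumptions, not published coefficients).
baselines = {"question_count": (0.5, 0.8), "vocabulary_diversity": (0.70, 0.10)}
openness_weights = {"question_count": 0.3, "vocabulary_diversity": 0.5}

at_baseline = trait_score(
    {"question_count": 0.5, "vocabulary_diversity": 0.70}, baselines, openness_weights)
above_baseline = trait_score(
    {"question_count": 1.3, "vocabulary_diversity": 0.80}, baselines, openness_weights)
```

Because the sigmoid is symmetric around zero, a weighted sum of zero maps to exactly 50, and scores approach 0 or 100 only for users far from the baseline on several weighted features.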
Weights are not learned from Mirror user data; they are derived from published correlation coefficients in the personality-language literature. This design choice means Mirror does not require a training dataset of users who have taken both questionnaires and had their writing analyzed. It also means Mirror never builds behavioral models from user data, which avoids the data mining concerns such models would raise.
Personality estimation accuracy improves with the volume of analyzed communication. Mirror quantifies this through two complementary systems: per-trait confidence intervals and an overall profile maturity indicator.
Each Big Five trait score is accompanied by a confidence value between 0.0 and 1.0, representing Mirror's certainty in the estimate. Confidence is computed from the number of events analyzed for that trait and the consistency (low variance) of the underlying feature distributions. A trait with high event count but high feature variance will have lower confidence than a trait with fewer events but highly consistent signals.
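The paper does not give the confidence formula, so the sketch below is purely illustrative: it assumes confidence is the product of a volume term that saturates with event count and a consistency term that falls as normalized feature variance rises. The function name, the `half_saturation` constant, and the functional form are all assumptions.

```python
def trait_confidence(n_events: int, normalized_variance: float,
                     half_saturation: int = 100) -> float:
    """Hypothetical confidence in [0, 1) from event count and variance.

    volume      -> rises toward 1 as events accumulate (0.5 at half_saturation)
    consistency -> 1 for perfectly stable features, lower as variance grows
    """
    if n_events <= 0:
        return 0.0
    volume = n_events / (n_events + half_saturation)
    consistency = 1.0 / (1.0 + max(0.0, normalized_variance))
    return volume * consistency


noisy_but_many = trait_confidence(n_events=400, normalized_variance=4.0)
steady_but_fewer = trait_confidence(n_events=150, normalized_variance=0.25)
```

Under these assumed inputs the trait with fewer but more consistent events receives the higher confidence, which matches the behavior described in the paragraph above.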
The overall profile maturity reflects the total number of communication events processed across all connected data sources. This provides users with a simple, intuitive sense of how complete their profile is.
Mirror also provides an estimated time-to-confident calculation based on the user's current data accumulation rate. If a user is generating 15 events per day and has processed 200, Mirror will display "~20 more days of your normal messaging" as a progress indicator.
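The arithmetic behind that indicator is a simple division, sketched here under the assumption that the target is the 500-event threshold the paper cites for a confident profile and that partial days round up. The function names are hypothetical.

```python
import math
from typing import Optional

CONFIDENT_THRESHOLD = 500  # events needed for a confident profile


def days_to_confident(processed: int, events_per_day: float,
                      threshold: int = CONFIDENT_THRESHOLD) -> Optional[int]:
    """Whole days of messaging still needed, or None if no data is arriving."""
    remaining = max(0, threshold - processed)
    if remaining == 0:
        return 0
    if events_per_day <= 0:
        return None
    return math.ceil(remaining / events_per_day)


def progress_label(processed: int, events_per_day: float) -> str:
    days = days_to_confident(processed, events_per_day)
    if days is None:
        return "Connect a source to keep building your profile"
    if days == 0:
        return "Your profile is ready"
    return f"~{days} more days of your normal messaging"


label = progress_label(processed=200, events_per_day=15)
```

For the example in the text, (500 - 200) / 15 = 20, so the label reads "~20 more days of your normal messaging". The two fallback strings are invented for the sketch.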
Before a full personality profile forms, users want meaningful feedback. Mirror's early signals system surfaces notable patterns as soon as 50 communication events have been processed — well before the 500-event threshold for a confident profile.
Early signals are observations about communication patterns that hint at personality traits without committing to a full score. Each early signal includes:
Early signals are generated by detecting statistical outliers in feature distributions relative to population baselines. If a user's question frequency is consistently above the 75th percentile after 50 messages, Mirror surfaces this as an early signal for Openness — without assigning a numerical score.
Design principle: Early signals use qualitative language ("more than average", "notably consistent") rather than numerical scores. This prevents users from anchoring on premature estimates that may shift significantly as more data arrives.
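A minimal sketch of such a check follows. It assumes "consistently above" can be approximated by comparing the user's mean to the population's 75th percentile, which is a simplification; the baseline value in the example and the wording of the returned phrase are made up for illustration.

```python
from statistics import mean
from typing import Optional, Sequence

MIN_EVENTS_FOR_SIGNAL = 50  # early signals start at 50 processed events


def early_signal(values: Sequence[float], population_p75: float,
                 behavior: str) -> Optional[str]:
    """Return a qualitative observation, never a numeric score."""
    if len(values) < MIN_EVENTS_FOR_SIGNAL:
        return None
    if mean(values) > population_p75:
        return f"You {behavior} more than average"
    return None


# Hypothetical data: questions per message for 60 messages, against a
# made-up population 75th percentile of 0.6 questions per message.
questions_per_message = [1.0, 0.0, 2.0] * 20
signal = early_signal(questions_per_message, 0.6, "ask questions")
```

The function deliberately returns text rather than a number, in line with the design principle of avoiding premature numerical anchors.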
While the Big Five model serves as Mirror's foundation, the app supports seven personality frameworks in total. Six additional frameworks are available to Pro subscribers. All derived frameworks are computed from the Big Five scores using published mapping functions; they do not require separate data analysis.
The Big Five model serves as a universal foundation because it has well-documented statistical relationships with most other personality frameworks. Research has established reliable mapping functions between Big Five scores and MBTI preferences (McCrae & Costa, 1989), Enneagram types (Dris & Noftle, 2011), DISC dimensions, and attachment style classifications. Mirror uses these published mapping functions to derive secondary framework scores from the primary Big Five analysis, rather than performing separate language analysis for each framework.
Mirror connects to three categories of communication data. Each source undergoes the same feature extraction pipeline. Free accounts can connect up to 3 sources; Pro accounts have no limit.
Mirror requests read access to sent mail only — it does not access received emails, drafts, or other mailbox contents. Each sent email is processed in-memory to extract a feature vector. The raw email body is never written to disk, stored in a database, or transmitted to any server. OAuth tokens are stored securely on-device and can be revoked at any time through Mirror's settings.
On Android, Mirror accesses SMS content with user permission to extract writing pattern features. On iOS, platform privacy restrictions prevent Mirror from accessing iMessage or SMS content directly, so Mirror falls back to contact frequency and timing metadata only. This fallback provides Conscientiousness and Extraversion signals but limited Openness and Agreeableness data.
Mirror analyzes call duration and frequency patterns — it does not record or analyze call audio. Call log data contributes primarily to Extraversion signals (call frequency, average duration, breadth of contacts) and Conscientiousness signals (consistency of calling patterns, time-of-day regularity).
Users can optionally import historical communications to accelerate profile maturity. Historical import requires explicit opt-in and processes messages in batches with a progress indicator. This is the fastest path to a confident profile — a user with 500+ sent emails in their Gmail account can reach confident profile status immediately after the initial import completes.
Privacy in Mirror is not a policy decision — it is a structural property of the system architecture. The service interfaces are designed so that raw text physically cannot be transmitted beyond the feature extraction boundary.
Mirror's code architecture enforces a strict boundary between text processing and personality analysis. The data ingestion service processes raw text and outputs only feature vectors (the 12 numerical features described in Section 5). The personality analysis service accepts only feature vectors as input — it has no API for receiving raw text. This structural API design means that even a code bug or misconfiguration cannot cause raw text to reach the cloud sync layer, because the types do not permit it.
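The boundary can be sketched as a feature-vector type with no text-bearing field and an analysis service whose only entry point accepts that type. This is an illustration rather than Mirror's code: the class and method names are hypothetical, and because Python checks types at runtime, the sketch uses explicit checks where a statically typed language would rely on the compiler.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FeatureVector:
    """The 12 features; every field is numeric or boolean, never text."""
    word_count: int
    question_count: int
    exclamation_count: int
    first_person_pronoun_ratio: float
    positive_emotion_words: int
    negative_emotion_words: int
    social_words: int
    cognitive_complexity_words: int
    avg_sentence_length: float
    vocabulary_diversity: float
    emoji_count: int
    is_reply: bool

    def __post_init__(self) -> None:
        for f in fields(self):
            value = getattr(self, f.name)
            # bool is a subclass of int, so this admits int, float, and bool only.
            if not isinstance(value, (int, float)):
                raise TypeError(f"{f.name} must be numeric, got {type(value).__name__}")


class PersonalityAnalysisService:
    """Accepts feature vectors only; there is no method that takes raw text."""

    def __init__(self) -> None:
        self._vectors: list[FeatureVector] = []

    def submit(self, vector: FeatureVector) -> int:
        if not isinstance(vector, FeatureVector):
            raise TypeError("analysis service accepts FeatureVector instances only")
        self._vectors.append(vector)
        return len(self._vectors)


service = PersonalityAnalysisService()
accepted = service.submit(
    FeatureVector(12, 1, 1, 0.0, 1, 0, 2, 1, 6.0, 1.0, 0, False))
```

Passing a string to `submit`, or constructing a `FeatureVector` with a string in any field, raises `TypeError` in this sketch.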
Users can export their complete data at any time through Mirror's privacy controls. The export is a JSON file containing only personality vectors and metadata — it explicitly declares that no raw communications are included. Account deletion triggers a full purge of all data from both the device and cloud storage, completing within 30 seconds.
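For illustration, an export of this shape might be built as follows. The paper does not publish the export schema, so every key name here (`containsRawCommunications`, `personality`, `metadata`, and so on) is an assumption, as is the function name.

```python
import json
from datetime import datetime, timezone


def build_export(scores: dict[str, float], events_processed: int) -> str:
    """Serialize derived scores and metadata only; no message text is accepted."""
    payload = {
        "exportedAt": datetime.now(timezone.utc).isoformat(),
        "containsRawCommunications": False,  # explicit declaration (hypothetical key)
        "personality": scores,
        "metadata": {"eventsProcessed": events_processed},
    }
    return json.dumps(payload, indent=2)


export_json = build_export(
    {"openness": 72.0, "conscientiousness": 58.0, "extraversion": 44.0,
     "agreeableness": 61.0, "emotionalStability": 55.0},
    events_processed=512,
)
```

The function's signature has no parameter that could carry message content, so the declaration in the file reflects what the function is able to emit.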
Mirror supports sign-in via Google OAuth and Apple Sign-In. Authentication is handled through Firebase Authentication, with identity tokens stored securely in the device keychain (iOS) or encrypted shared preferences (Android). Mirror does not implement custom password authentication.
On-device personality vectors and feature distributions are stored in encrypted local storage. Cloud-synced data is stored in Firestore with Firebase security rules that restrict read and write access to the authenticated user's own documents.
All network communication uses TLS 1.2 or higher. OAuth token exchanges follow the Authorization Code flow with PKCE (Proof Key for Code Exchange), the industry standard for mobile applications. The critical point is that very little data is in transit — the on-device architecture minimizes network exposure by design.
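The PKCE step can be sketched with the standard library alone. This follows the S256 method defined in RFC 7636 (the challenge is the unpadded base64url encoding of the SHA-256 hash of the verifier); the function names are hypothetical, and Mirror's actual client code is not shown in the paper.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier: str) -> str:
    """S256 code challenge: BASE64URL(SHA256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Generate a fresh (code_verifier, code_challenge) pair.

    64 random bytes encode to 86 URL-safe characters, inside the 43-128
    character range that RFC 7636 requires for the verifier.
    """
    verifier = secrets.token_urlsafe(64)
    return verifier, challenge_for(verifier)


verifier, challenge = make_pkce_pair()
```

The client sends `challenge` with the authorization request and later proves possession by sending `verifier` with the token request, so an intercepted authorization code cannot be redeemed without the verifier.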
Gmail OAuth tokens are stored on-device and used only for fetching sent mail during active sync operations. Tokens are never transmitted to Mirror servers. When a user disconnects a data source, access tokens are revoked and deleted from the device. Derived personality vectors from previously analyzed communications are retained unless the user explicitly requests deletion.
Mirror does not share personality data, behavioral data, usage data, or any other user data with third parties, advertisers, data brokers, or analytics providers. The only external services Mirror communicates with are Firebase (for authentication and optional cloud backup) and the respective OAuth providers during sign-in.
Mirror provides personality estimates, not clinical assessments. Users and any downstream consumers of Mirror data should understand the following limitations: