AI guardrails: Understanding PII detection

This article highlights the importance of PII, detection methods like regex, Presidio, and transformers, and evaluation with Weave to ensure accurate and adaptable data protection.
Brett Young
Created on December 26|Last edited on March 1
In today’s interconnected world, Personally Identifiable Information (PII) is at the core of countless digital services and platforms. As data crosses borders freely, establishing clear guardrails for PII has never been more important. These guardrails—built around legal requirements, ethical considerations, and best-practice frameworks—ensure that sensitive data such as names, email addresses, phone numbers, or financial and health records remains protected.
Below, we'll dive into the fundamentals of PII and explore guardrails designed to protect it. If you're eager to see these and other guardrails in action, check out the accompanying Colab. Otherwise, continue reading for further details and the code you'll need to work with PII guardrails.

Table of contents
What is PII?
Why PII guardrails matter
The importance of PII
PII detection guardrail tutorial
  Regex-based PII detection guardrail
  Presidio-based PII detection guardrail
  Transformer-based PII detection
  Evaluating PII guardrails with Weave
  Performance analysis
Conclusion

What is PII?
PII is data that identifies someone, from names or emails to financial details. Legal, ethical, and technical guardrails ensure this sensitive information stays secure across industries like banking, healthcare, and education.
Personally Identifiable Information restrictions are a set of legal, ethical, and organizational guidelines designed to protect the privacy and security of individuals whose data is collected, stored, and processed. These restrictions vary by jurisdiction and industry but generally aim to limit the use of PII to legitimate purposes while preventing unauthorized access, misuse, or exposure.
Under laws like the General Data Protection Regulation (GDPR) in the European Union, the handling of PII is tightly regulated. GDPR requires that data be collected and processed only for specified, explicit, and legitimate purposes. It requires a lawful basis for processing, such as informed consent from the individual, and imposes strict limitations on sharing or transferring data to third parties, especially across international borders. Organizations must also implement measures like pseudonymization and encryption to safeguard sensitive data, and must give individuals the right to access, correct, and request the deletion of their data.
In the United States, PII restrictions depend on the context and the specific type of information. For example, the Health Insurance Portability and Accountability Act (HIPAA) restricts the handling of health-related PII by healthcare providers, requiring that Protected Health Information (PHI) be safeguarded against unauthorized disclosure. Similarly, the Children's Online Privacy Protection Act (COPPA) imposes strict controls over the collection and use of data from children under 13, emphasizing parental consent and limiting data retention.
Many industries have adopted their own standards to enforce PII restrictions. For instance, the financial sector follows the Payment Card Industry Data Security Standard (PCI DSS) to protect payment and account information, while educational institutions in the U.S. comply with the Family Educational Rights and Privacy Act (FERPA) to protect student records. These frameworks typically require organizations to implement robust access controls, regularly audit data practices, and provide training to employees on data privacy.
Why PII guardrails matter
By requiring data minimization, user consent, and strict security measures, PII guardrails reduce the risk of breaches and unauthorized disclosure. They also provide recourse for individuals who wish to access, correct, or delete their personal data, ensuring ongoing trust in digital services.
The importance of PII
The importance of PII lies in its intrinsic connection to privacy and security. When handled appropriately, PII allows individuals to access personalized services and enables organizations to operate more effectively. However, the misuse or mishandling of PII can lead to dire consequences, including privacy violations, financial loss, and emotional distress. Cybercriminals often target PII to commit identity theft and fraud, exploiting vulnerabilities in systems or breaches to gain unauthorized access to sensitive data.
Legal frameworks around the world, such as the General Data Protection Regulation in Europe and the California Consumer Privacy Act (CCPA) in the United States, have been established to ensure that organizations collect, store, and process PII responsibly. In the context of healthcare, the Health Insurance Portability and Accountability Act (HIPAA) in the United States sets stringent standards for the protection of health-related PII, referred to as Protected Health Information. HIPAA mandates that healthcare providers, insurers, and their business associates implement measures to safeguard medical records and other health information, ensuring confidentiality, integrity, and availability of this highly sensitive data. Non-compliance with HIPAA can result in severe penalties and undermine patient trust.
As society continues to rely heavily on digital platforms, safeguarding PII is not just a regulatory requirement but also an ethical responsibility. By recognizing the value and vulnerabilities of PII, organizations can take proactive measures to protect this sensitive information in an increasingly data-driven world.
PII detection guardrail tutorial
Detecting and protecting PII has become a critical step for maintaining compliance, safeguarding user privacy, and ensuring responsible data practices. In this tutorial, we'll demonstrate how to set up various PII detection guardrails, from simple regex-based detection to advanced AI-powered methods, to automatically flag and handle PII in text.
To start, install the safeguards library with the following command:
git clone https://github.com/soumik12345/safeguards.git && cd safeguards && pip install -e .
Regex-based PII detection guardrail
The RegexEntityRecognitionGuardrail uses predefined patterns to identify PII in text. This straightforward guardrail works well for structured data like phone numbers and email addresses, offering high interpretability and easy configuration. However, its fixed nature means it may not handle edge cases or variations in text effectively.
Using Weave allows us to log and visualize the detected entities for a deeper understanding of the guardrail's performance. Normally, you need to add the @weave.op decorator above any function you would like to track, but since the safeguards library has a native integration, we simply need to import and initialize Weave, and our results will be tracked automatically:
from safeguards.guardrails.entity_recognition import RegexEntityRecognitionGuardrail
import weave; weave.init("guardrails-pii")
# Define hardcoded sample data
test_cases = [
    {
        "input_text": "Contact me at john.doe@example.com or call me at (123) 456-7890.",
        "expected_entities": {
            "EMAIL": ["john.doe@example.com"],
            "TELEPHONENUM": ["(123) 456-7890"],
        },
    },
    {
        "input_text": "My SSN is 123-45-6789, and my credit card is 4111-1111-1111-1111.",
        "expected_entities": {
            "SOCIALNUM": ["123-45-6789"],
            "CREDITCARDNUMBER": ["4111-1111-1111-1111"],
        },
    },
]
﻿
# Initialize the regex-based guardrail
regex_guardrail = RegexEntityRecognitionGuardrail(should_anonymize=True)
﻿
# Process each test case
for i, case in enumerate(test_cases, 1):
    try:
        # Use the `guard` method for PII detection
        result = regex_guardrail.guard(case["input_text"])
        print(f"Test Case {i}")
        print(f"Input: {case['input_text']}")
        print(f"Expected Entities: {case['expected_entities']}")
        print(f"Detected Entities: {result}\n")
    except AttributeError as e:
        print(f"Error processing Test Case {i}: {e}")
﻿
The RegexEntityRecognitionGuardrail was initialized and applied to text samples containing typical PII like email addresses and phone numbers. The guard method returned the detected entities based on its regex patterns, which were logged into Weave. This made it easy to observe where the guardrail performed as expected and where it missed entities. While effective for simple cases, this method’s reliance on fixed patterns limits its adaptability to complex text.
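To make the idea of fixed patterns concrete, here is a minimal, standalone sketch of regex-based detection using only Python's `re` module. The patterns and the `detect_pii_with_regex` helper are illustrative assumptions, not the patterns shipped with the safeguards library, and real-world formats vary far more than these cover.

```python
import re
from typing import Dict, List

# Illustrative patterns only (assumed for this sketch, not taken from the library).
PII_PATTERNS: Dict[str, re.Pattern] = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "TELEPHONENUM": re.compile(r"\(\d{3}\)\s?\d{3}-\d{4}"),
    "SOCIALNUM": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDITCARDNUMBER": re.compile(r"\b\d{4}(?:-\d{4}){3}\b"),
}


def detect_pii_with_regex(text: str) -> Dict[str, List[str]]:
    """Return every pattern match in `text`, grouped by entity type."""
    detected: Dict[str, List[str]] = {}
    for entity_type, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            detected[entity_type] = matches
    return detected


sample = "Contact me at john.doe@example.com or call me at (123) 456-7890."
print(detect_pii_with_regex(sample))

# A phone number written with dots does not fit the fixed pattern, so it is missed.
print(detect_pii_with_regex("Call me at 123.456.7890."))
```

The second call returns an empty dictionary, which is the adaptability problem described above: any format the patterns did not anticipate passes through undetected.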
Presidio-based PII detection guardrail
The PresidioEntityRecognitionGuardrail is built on Microsoft's Presidio framework, combining regex rules with context-aware detection capabilities. This PII guardrail is more adaptable than regex alone and can recognize entities even when their format varies slightly. Here is the code:
from safeguards.guardrails.entity_recognition import PresidioEntityRecognitionGuardrail
import weave; weave.init("guardrails-pii")
# Define hardcoded sample data
test_cases = [
    {
        "input_text": "Jane's email is jane.doe@gmail.com, and her phone is +1-800-555-1234.",
        "expected_entities": {
            "EMAIL_ADDRESS": ["jane.doe@gmail.com"],
            "PHONE_NUMBER": ["+1-800-555-1234"],
        },
    },
    {
        "input_text": "My passport number is A12345678, and I live in New York.",
        "expected_entities": {
            "US_PASSPORT": ["A12345678"],
            "LOCATION": ["New York"],
        },
    },
]
﻿
# Initialize the Presidio-based guardrail
presidio_guardrail = PresidioEntityRecognitionGuardrail(should_anonymize=True)
﻿
# Process each test case
for i, case in enumerate(test_cases, 1):
    try:
        # Use the `guard` method for PII detection
        result = presidio_guardrail.guard(case["input_text"])
        print(f"Test Case {i}")
        print(f"Input: {case['input_text']}")
        print(f"Expected Entities: {case['expected_entities']}")
        print(f"Detected Entities: {result}\n")
    except AttributeError as e:
        print(f"Error processing Test Case {i}: {e}")
﻿
 The PresidioEntityRecognitionGuardrail was applied to text samples, identifying PII such as email addresses and passport numbers with better flexibility than regex. The guard method handled variations in entity formatting effectively and returned a set of detected entities that were logged into Weave. 
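The idea of combining a pattern with surrounding context can be sketched without Presidio itself. The example below is a simplified illustration of the general technique, not Presidio's implementation: a loose pattern gets a low base score, and the score is raised when a supporting word appears shortly before the match. The pattern, scores, window size, context words, and the `find_passport_candidates` helper are all assumed values chosen for the example.

```python
import re
from typing import List, NamedTuple


class Candidate(NamedTuple):
    entity_type: str
    value: str
    score: float


# Assumed values for illustration only.
PASSPORT_PATTERN = re.compile(r"\b[A-Z]\d{8}\b")
CONTEXT_WORDS = ("passport", "travel document")
BASE_SCORE = 0.3
CONTEXT_BOOST = 0.5
CONTEXT_WINDOW = 40  # number of characters inspected before each match


def find_passport_candidates(text: str) -> List[Candidate]:
    """Score passport-like strings, boosting those preceded by a context word."""
    candidates = []
    for match in PASSPORT_PATTERN.finditer(text):
        window_start = max(0, match.start() - CONTEXT_WINDOW)
        window = text[window_start:match.start()].lower()
        has_context = any(word in window for word in CONTEXT_WORDS)
        score = BASE_SCORE + (CONTEXT_BOOST if has_context else 0.0)
        candidates.append(Candidate("US_PASSPORT", match.group(), score))
    return candidates


print(find_passport_candidates("My passport number is A12345678, and I live in New York."))
print(find_passport_candidates("Order reference A12345678 has shipped."))
```

Both sentences contain the same string, but only the first receives the boosted score. A downstream threshold can then keep the passport mention and discard the order reference, which a bare regex cannot distinguish.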
Transformer-based PII detection
The TransformersEntityRecognitionGuardrail uses machine learning models like transformers to identify PII in text. Its ability to understand context and adapt to unstructured data makes it particularly powerful for complex or nuanced scenarios. This approach requires no predefined rules, relying instead on the capabilities of pre-trained language models. Logging its outputs into Weave allows for detailed performance analysis and comparison against other methods:
from safeguards.guardrails.entity_recognition import TransformersEntityRecognitionGuardrail
import weave; weave.init("guardrails-pii")
﻿
﻿
# Define hardcoded sample data
test_cases = [
    {
        "input_text": "My name is Brett Johnson, and my phone number is +1 987-654-3210.",
        "expected_entities": {
            "PERSON": ["Brett Johnson"],
            "TELEPHONENUM": ["+1 987-654-3210"],
        },
    },
    {
        "input_text": "The acc. # is 1234532289, and the credit card is 5555-5555-5555-5555.",
        "expected_entities": {
            "ACCOUNTNUM": ["1234532289"],
            "CREDITCARDNUMBER": ["5555-5555-5555-5555"],
        },
    },
]
﻿
# Initialize the transformer-based guardrail
transformer_guardrail = TransformersEntityRecognitionGuardrail(should_anonymize=True)
﻿
# Process each test case
for i, case in enumerate(test_cases, 1):
    try:
        # Use the `guard` method for PII detection
        result = transformer_guardrail.guard(case["input_text"])
        print(f"Test Case {i}")
        print(f"Input: {case['input_text']}")
        print(f"Expected Entities: {case['expected_entities']}")
        print(f"Detected Entities: {result}\n")
    except AttributeError as e:
        print(f"Error processing Test Case {i}: {e}")
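
Token-classification models commonly label each token with a BIO tag ("B-" begins an entity, "I-" continues it, "O" is outside any entity), and those tags are then merged into whole entities. The sketch below shows that merging step on hand-written predictions. Whether the safeguards guardrail works this way internally is an assumption; the `predictions` list and the `group_bio_spans` helper are hypothetical and written only for this illustration.

```python
from typing import Dict, List, Tuple

text = "My name is Brett Johnson, call 987-654-3210."

# Hypothetical (start, end, tag) predictions, using character offsets into `text`.
predictions: List[Tuple[int, int, str]] = [
    (11, 16, "B-GIVENNAME"),
    (17, 24, "B-SURNAME"),
    (24, 25, "O"),
    (26, 30, "O"),
    (31, 34, "B-TELEPHONENUM"),
    (34, 35, "I-TELEPHONENUM"),
    (35, 38, "I-TELEPHONENUM"),
    (38, 39, "I-TELEPHONENUM"),
    (39, 43, "I-TELEPHONENUM"),
]


def group_bio_spans(
    source: str, tagged: List[Tuple[int, int, str]]
) -> Dict[str, List[str]]:
    """Merge each B- token and the I- tokens that follow it into one entity."""
    entities: Dict[str, List[str]] = {}
    label, start, end = None, 0, 0
    # The trailing "O" sentinel flushes an entity that ends with the input.
    for tok_start, tok_end, tag in list(tagged) + [(0, 0, "O")]:
        prefix, _, tag_label = tag.partition("-")
        if prefix == "I" and tag_label == label:
            end = tok_end  # extend the entity that is currently open
            continue
        if label is not None:
            entities.setdefault(label, []).append(source[start:end])
        if prefix == "B":
            label, start, end = tag_label, tok_start, tok_end
        else:
            label = None  # "O" tags and stray "I-" tags open nothing
    return entities


print(group_bio_spans(text, predictions))
```

Slicing the original text by character offsets, rather than joining token strings, keeps separators such as the hyphens in the phone number or the space in a two-word city name intact.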
﻿
The TransformersEntityRecognitionGuardrail processed text samples and detected PII with high accuracy, even in challenging contexts. For instance, it correctly identified entities embedded in longer sentences or with ambiguous structures. The detected entities were logged into Weave, making it easy to compare its performance with the other guardrails. Here's what it looks like inside Weave after running our script: 
﻿
Evaluating PII guardrails with Weave
After detecting PII with each guardrail, evaluating their performance systematically is crucial. Using Weave's evaluation framework, the detected entities and associated metrics like precision, recall, and F1-score can be logged and compared to gain a clear picture of their effectiveness. This provides actionable insights into how each guardrail performs on different types of data and helps identify the most suitable method for specific use cases.
We will evaluate three different PII detection guardrails: RegexEntityRecognitionGuardrail, PresidioEntityRecognitionGuardrail, and TransformersEntityRecognitionGuardrail. The goal is to measure their performance in identifying PII entities across a dataset, using evaluation metrics such as precision, recall, and F1-score. Here is the code that will run our evaluation: 
import asyncio
import json
import random
from pathlib import Path
from typing import Dict, List, Optional
﻿
import weave
from datasets import load_dataset
from weave import Evaluation
from weave.scorers import Scorer
﻿
from safeguards.guardrails.entity_recognition import (
    RegexEntityRecognitionGuardrail, 
    PresidioEntityRecognitionGuardrail, 
    TransformersEntityRecognitionGuardrail
)
﻿
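# Map Presidio entity labels onto the label set used by the dataset and the
# transformer model, so all three guardrails are scored against the same names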
PRESIDIO_TO_TRANSFORMER_MAPPING = {
    "EMAIL_ADDRESS": "EMAIL",
    "PHONE_NUMBER": "TELEPHONENUM",
    "US_SSN": "SOCIALNUM",
    "CREDIT_CARD": "CREDITCARDNUMBER",
    "IP_ADDRESS": "IDCARDNUM",
    "DATE_TIME": "DATEOFBIRTH",
    "US_PASSPORT": "IDCARDNUM",
    "US_DRIVER_LICENSE": "DRIVERLICENSENUM",
    "US_BANK_NUMBER": "ACCOUNTNUM",
    "LOCATION": "CITY",
    "URL": "USERNAME",  # URLs often contain usernames
    "IN_PAN": "TAXNUM",  # Indian Permanent Account Number
    "UK_NHS": "IDCARDNUM",
    "SG_NRIC_FIN": "IDCARDNUM",
    "AU_ABN": "TAXNUM",  # Australian Business Number
    "AU_ACN": "TAXNUM",  # Australian Company Number
    "AU_TFN": "TAXNUM",  # Australian Tax File Number
    "AU_MEDICARE": "IDCARDNUM",
    "IN_AADHAAR": "IDCARDNUM",  # Indian national ID
    "IN_VOTER": "IDCARDNUM",
    "IN_PASSPORT": "IDCARDNUM",
    "CRYPTO": "ACCOUNTNUM",  # Cryptocurrency addresses
    "IBAN_CODE": "ACCOUNTNUM",
    "MEDICAL_LICENSE": "IDCARDNUM",
    "IN_VEHICLE_REGISTRATION": "IDCARDNUM",
}
﻿
﻿
class EntityRecognitionScorer(Scorer):
    """Scorer for evaluating entity recognition performance"""
﻿
    @weave.op()
    async def score(
        self, model_output: Optional[dict], input_text: str, expected_entities: Dict
    ) -> Dict:
        """Score entity recognition results"""
        if not model_output:
            return {"f1": 0.0}
﻿
        # Convert Pydantic model to dict if necessary
        if hasattr(model_output, "model_dump"):
            model_output = model_output.model_dump()
        elif hasattr(model_output, "dict"):
            model_output = model_output.dict()
﻿
        detected = model_output.get("detected_entities", {})
﻿
        # Map Presidio entities if needed
        if model_output.get("model_type") == "presidio":
            mapped_detected = {}
            for entity_type, values in detected.items():
                mapped_type = PRESIDIO_TO_TRANSFORMER_MAPPING.get(entity_type)
                if mapped_type:
                    if mapped_type not in mapped_detected:
                        mapped_detected[mapped_type] = []
                    mapped_detected[mapped_type].extend(values)
            detected = mapped_detected
﻿
        # Track entity-level metrics
        all_entity_types = set(list(detected.keys()) + list(expected_entities.keys()))
        entity_metrics = {}
﻿
        for entity_type in all_entity_types:
            detected_set = set(detected.get(entity_type, []))
            expected_set = set(expected_entities.get(entity_type, []))
﻿
            # Calculate metrics
            true_positives = len(detected_set & expected_set)
            false_positives = len(detected_set - expected_set)
            false_negatives = len(expected_set - detected_set)
﻿
            if entity_type not in entity_metrics:
                entity_metrics[entity_type] = {
                    "total_true_positives": 0,
                    "total_false_positives": 0,
                    "total_false_negatives": 0,
                }
﻿
            entity_metrics[entity_type]["total_true_positives"] += true_positives
            entity_metrics[entity_type]["total_false_positives"] += false_positives
            entity_metrics[entity_type]["total_false_negatives"] += false_negatives
﻿
            # Calculate per-entity metrics
            precision = (
                true_positives / (true_positives + false_positives)
                if (true_positives + false_positives) > 0
                else 0
            )
            recall = (
                true_positives / (true_positives + false_negatives)
                if (true_positives + false_negatives) > 0
                else 0
            )
            f1 = (
                2 * (precision * recall) / (precision + recall)
                if (precision + recall) > 0
                else 0
            )
﻿
            entity_metrics[entity_type].update(
                {"precision": precision, "recall": recall, "f1": f1}
            )
﻿
        # Calculate overall metrics
        total_tp = sum(
            metrics["total_true_positives"] for metrics in entity_metrics.values()
        )
        total_fp = sum(
            metrics["total_false_positives"] for metrics in entity_metrics.values()
        )
        total_fn = sum(
            metrics["total_false_negatives"] for metrics in entity_metrics.values()
        )
﻿
        overall_precision = (
            total_tp / (total_tp + total_fp) if (total_tp + total_fp) > 0 else 0
        )
        overall_recall = (
            total_tp / (total_tp + total_fn) if (total_tp + total_fn) > 0 else 0
        )
        overall_f1 = (
            2
            * (overall_precision * overall_recall)
            / (overall_precision + overall_recall)
            if (overall_precision + overall_recall) > 0
            else 0
        )
﻿
        entity_metrics["overall"] = {
            "precision": overall_precision,
            "recall": overall_recall,
            "f1": overall_f1,
            "total_true_positives": total_tp,
            "total_false_positives": total_fp,
            "total_false_negatives": total_fn,
        }
﻿
        return entity_metrics["overall"]
﻿
﻿
def load_ai4privacy_dataset(
    num_samples: int = 100, split: str = "validation"
) -> List[Dict]:
    """
    Load and prepare samples from the ai4privacy dataset.
﻿
    Args:
        num_samples: Number of samples to evaluate
        split: Dataset split to use ("train" or "validation")
﻿
    Returns:
        List of prepared test cases
    """
    # Load the dataset
    dataset = load_dataset("ai4privacy/pii-masking-400k")
﻿
    # Get the specified split
    data_split = dataset[split]
﻿
    # Randomly sample entries if num_samples is less than total
    if num_samples < len(data_split):
        indices = random.sample(range(len(data_split)), num_samples)
        samples = [data_split[i] for i in indices]
    else:
        samples = data_split
﻿
    # Convert to test case format
    test_cases = []
    for sample in samples:
        # Extract entities from privacy_mask
        entities: Dict[str, List[str]] = {}
        for entity in sample["privacy_mask"]:
            label = entity["label"]
            value = entity["value"]
            if label not in entities:
                entities[label] = []
            entities[label].append(value)
﻿
        test_case = {
            "description": f"AI4Privacy Sample (ID: {sample['uid']})",
            "input_text": sample["source_text"],
            "expected_entities": entities,
            "masked_text": sample["masked_text"],
            "language": sample["language"],
            "locale": sample["locale"],
        }
        test_cases.append(test_case)
﻿
    return test_cases
﻿
﻿
def save_results(
    weave_results: Dict, model_name: str, output_dir: str = "evaluation_results"
):
    """Save evaluation results to files"""
    output_dir = Path(output_dir)
    output_dir.mkdir(exist_ok=True)
﻿
    # Extract and process results
    scorer_results = weave_results.get("EntityRecognitionScorer", [])
    if not scorer_results or all(r is None for r in scorer_results):
        print(f"No valid results to save for {model_name}")
        return
﻿
    # Calculate summary metrics
    total_samples = len(scorer_results)
    passed = sum(1 for r in scorer_results if r is not None and not isinstance(r, str))
﻿
    # Aggregate entity-level metrics
    entity_metrics = {}
    for result in scorer_results:
        try:
            if isinstance(result, str) or not result:
                continue
﻿
            for entity_type, metrics in result.items():
                if entity_type not in entity_metrics:
                    entity_metrics[entity_type] = {
                        "precision": [],
                        "recall": [],
                        "f1": [],
                    }
                entity_metrics[entity_type]["precision"].append(metrics["precision"])
                entity_metrics[entity_type]["recall"].append(metrics["recall"])
                entity_metrics[entity_type]["f1"].append(metrics["f1"])
        except (AttributeError, TypeError, KeyError):
            continue
﻿
    # Calculate averages
    summary_metrics = {
        "total": total_samples,
        "passed": passed,
        "failed": total_samples - passed,
        "success_rate": (passed / total_samples) if total_samples > 0 else 0,
        "entity_metrics": {
            entity_type: {
                "precision": (
                    sum(metrics["precision"]) / len(metrics["precision"])
                    if metrics["precision"]
                    else 0
                ),
                "recall": (
                    sum(metrics["recall"]) / len(metrics["recall"])
                    if metrics["recall"]
                    else 0
                ),
                "f1": sum(metrics["f1"]) / len(metrics["f1"]) if metrics["f1"] else 0,
            }
            for entity_type, metrics in entity_metrics.items()
        },
    }
﻿
    # Save files
    with open(output_dir / f"{model_name}_metrics.json", "w") as f:
        json.dump(summary_metrics, f, indent=2)
﻿
    # Save detailed results, filtering out string results
    detailed_results = [
        r for r in scorer_results if not isinstance(r, str) and r is not None
    ]
    with open(output_dir / f"{model_name}_detailed_results.json", "w") as f:
        json.dump(detailed_results, f, indent=2)
﻿
﻿
def print_metrics_summary(weave_results: Dict):
    """Print a summary of the evaluation metrics"""
    print("\nEvaluation Summary")
    print("=" * 80)
﻿
    # Extract results from Weave's evaluation format
    scorer_results = weave_results.get("EntityRecognitionScorer", {})
    if not scorer_results:
        print("No valid results available")
        return
﻿
    # Calculate overall metrics
    total_samples = int(weave_results.get("model_latency", {}).get("count", 0))
    passed = total_samples  # Since we have results, all samples passed
    failed = 0
﻿
    print(f"Total Samples: {total_samples}")
    print(f"Passed: {passed}")
    print(f"Failed: {failed}")
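    # Guard against division by zero when no sample count was reported
    success_rate = (passed / total_samples) * 100 if total_samples else 0.0
    print(f"Success Rate: {success_rate:.2f}%")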
﻿
    # Print overall metrics
    if "overall" in scorer_results:
        overall = scorer_results["overall"]
        print("\nOverall Metrics:")
        print("-" * 80)
        print(f"{'Metric':<20} {'Value':>10}")
        print("-" * 80)
        print(f"{'Precision':<20} {overall['precision']['mean']:>10.2f}")
        print(f"{'Recall':<20} {overall['recall']['mean']:>10.2f}")
        print(f"{'F1':<20} {overall['f1']['mean']:>10.2f}")
﻿
    # Print entity-level metrics
    print("\nEntity-Level Metrics:")
    print("-" * 80)
    print(f"{'Entity Type':<20} {'Precision':>10} {'Recall':>10} {'F1':>10}")
    print("-" * 80)
﻿
    for entity_type, metrics in scorer_results.items():
        if entity_type == "overall":
            continue
﻿
        precision = metrics.get("precision", {}).get("mean", 0)
        recall = metrics.get("recall", {}).get("mean", 0)
        f1 = metrics.get("f1", {}).get("mean", 0)
﻿
        print(f"{entity_type:<20} {precision:>10.2f} {recall:>10.2f} {f1:>10.2f}")
﻿
﻿
def preprocess_model_input(example: Dict) -> Dict:
    """Preprocess dataset example to match model input format."""
    return {
        "prompt": example["input_text"],
        "model_type": example.get(
            "model_type", "unknown"
        ),  # Add model type for Presidio mapping
    }
﻿
﻿
def main():
    """Main evaluation function"""
    weave.init("guardrails-genie-pii-evaluation")
﻿
    # Load test cases
    test_cases = load_ai4privacy_dataset(num_samples=100)
﻿
    # Instantiate the three guardrails to compare
    models = {
        "regex": RegexEntityRecognitionGuardrail(should_anonymize=True),
        "presidio": PresidioEntityRecognitionGuardrail(should_anonymize=True),
        "transformers": TransformersEntityRecognitionGuardrail(should_anonymize=True)
    }
﻿
    scorer = EntityRecognitionScorer()
﻿
    # Evaluate each model
    for model_name, guardrail in models.items():
        print(f"\nEvaluating {model_name} model...")
        # Add model type to test cases
        model_test_cases = [{**case, "model_type": model_name} for case in test_cases]
﻿
        evaluation = Evaluation(
            dataset=model_test_cases,
            scorers=[scorer],
            preprocess_model_input=preprocess_model_input,
        )
﻿
        asyncio.run(evaluation.evaluate(guardrail))
﻿
﻿
if __name__ == "__main__":
    main()
We begin by defining a mapping dictionary to normalize entity types between Presidio and transformer-based models, ensuring consistency in the evaluation process. A custom EntityRecognitionScorer class is implemented to handle entity-level comparisons and calculate the evaluation metrics. This scorer takes into account true positives, false positives, and false negatives for each entity type.
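A small worked example may help show how these counts turn into scores. The sketch below reproduces the set-based comparison on a single hand-made sample and pools the counts across entity types, as the scorer's overall metrics do. The `micro_prf` helper and the sample values are written for this illustration and are not part of the evaluation script.

```python
from typing import Dict, List


def micro_prf(
    detected: Dict[str, List[str]], expected: Dict[str, List[str]]
) -> Dict[str, float]:
    """Pool true positives, false positives, and false negatives over all types."""
    tp = fp = fn = 0
    for entity_type in set(detected) | set(expected):
        detected_set = set(detected.get(entity_type, []))
        expected_set = set(expected.get(entity_type, []))
        tp += len(detected_set & expected_set)  # exact string matches
        fp += len(detected_set - expected_set)  # detected but not expected
        fn += len(expected_set - detected_set)  # expected but not detected
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


expected = {"EMAIL": ["john.doe@example.com"], "TELEPHONENUM": ["(123) 456-7890"]}
# The email matches exactly; the phone number is only partially captured.
detected = {"EMAIL": ["john.doe@example.com"], "TELEPHONENUM": ["456-7890"]}

print(micro_prf(detected, expected))
# {'precision': 0.5, 'recall': 0.5, 'f1': 0.5}
```

Because the comparison uses exact string equality, the truncated phone number counts as both a false positive and a false negative. A guardrail that finds roughly the right span but trims or extends it by a few characters is therefore scored the same as one that misses the entity entirely.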
A dataset is then prepared using the load_ai4privacy_dataset function, which extracts and structures test cases for evaluation. Each guardrail is applied to the dataset, and the detected entities are compared with the expected ones. The results are logged and saved using Weave’s evaluation framework, allowing for detailed analysis and visualization of the guardrails’ performance.
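To see the structure that the scorer receives, the following sketch groups a `privacy_mask` list into the `expected_entities` dictionary used throughout the evaluation. The record is hand-written to mirror only the fields the loader reads (`source_text`, and `privacy_mask` entries with `label` and `value`); its contents and the `group_privacy_mask` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical record shaped like the fields read by load_ai4privacy_dataset.
sample = {
    "source_text": "Email jane.doe@gmail.com or jdoe@example.org about Jane.",
    "privacy_mask": [
        {"label": "EMAIL", "value": "jane.doe@gmail.com"},
        {"label": "EMAIL", "value": "jdoe@example.org"},
        {"label": "GIVENNAME", "value": "Jane"},
    ],
}


def group_privacy_mask(privacy_mask: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Collect the annotated values under their entity label."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in privacy_mask:
        grouped[entity["label"]].append(entity["value"])
    return dict(grouped)


print(group_privacy_mask(sample["privacy_mask"]))
```

Each label maps to a list because one text can contain several entities of the same type, as with the two email addresses here.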
Weave logs provide a clear visualization of each guardrail’s performance, including metrics for overall detection accuracy as well as detailed metrics for each individual entity type. This granular logging makes it easy to compare results across the methods, highlighting how each guardrail performs for specific PII categories. This evaluation helps in selecting the most suitable guardrail based on the specific requirements of a use case, such as simplicity, adaptability, or robustness in handling complex data.
I'll share a table of some of the performance metrics down below: 
Model Performance Summary

Model         Success Rate  Overall Precision  Overall Recall  Overall F1
Regex         0.0%          0.03               0.50            0.06
Presidio      12.0%         0.09               0.17            0.12
Transformers  77.0%         0.81               0.83            0.82
﻿
Detailed Entity-Level F1 Scores

Entity Type       Regex  Presidio  Transformers
EMAIL             0.93   1.00      1.00
SURNAME           0.05   0.00      0.86
TELEPHONENUM      0.00   0.13      0.82
GIVENNAME         0.08   0.00      0.90
CITY              0.06   0.00      0.92
DRIVERLICENSENUM  0.00   0.11      0.91
STREET            0.00   0.00      0.89
TAXNUM            0.00   0.03      1.00
USERNAME          0.00   0.00      0.75
PASSWORD          0.00   0.00      0.75
ZIPCODE           0.42   0.00      0.53
ACCOUNTNUM        0.24   0.20      0.77
DATEOFBIRTH       0.00   0.00      1.00
IDCARDNUM         0.00   0.13      0.67
CREDITCARDNUMBER  0.40   0.33      0.80
BUILDINGNUM       0.04   0.00      0.55
SOCIALNUM         0.22   0.22      0.50
﻿
Performance analysis
The performance summary shows large differences between the methods. The regex guardrail performs poorly, with a 0.0% success rate and an F1 score of 0.06. Its recall of 0.50 is offset by a precision of 0.03, meaning its patterns match many strings that are not labeled as PII in the dataset.
Presidio does only slightly better, with a 12.0% success rate and an F1 score of 0.12 (precision 0.09, recall 0.17), so it identifies some entities but is neither accurate nor consistent. The transformer guardrail is far ahead, with a 77.0% success rate and an F1 score of 0.82, balancing precision (0.81) and recall (0.83). At the entity level, regex and Presidio score well only on EMAIL, while the transformer scores 0.50 or higher on every entity type. On this dataset, the transformer-based model clearly outperforms the regex and rule-based approaches.
Layering your approach with multiple methods can be an effective strategy. Starting with regex allows for quick filtering, followed by deeper analysis on flagged content to enhance accuracy, and combining methods ensures robust performance in critical applications. Customizing the approach to fit specific domains can also be beneficial, which includes adding industry-specific patterns, training models on domain-relevant data formats, and fine-tuning confidence thresholds.
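A layered setup can be sketched as a cheap screen in front of a slower detector. In the example below, `QUICK_SCREEN`, `layered_detect`, and `stub_deep_detector` are hypothetical names written for this illustration; the stub stands in for a model-based guardrail and simply records how often it is called.

```python
import re
from typing import Callable, Dict, List

# Assumed first-pass screen: an "@" or a run of three or more digits hints at PII.
QUICK_SCREEN = re.compile(r"@|\d{3,}")

Detector = Callable[[str], Dict[str, List[str]]]
calls: List[str] = []


def stub_deep_detector(text: str) -> Dict[str, List[str]]:
    """Stand-in for a slower model-based guardrail; records each call."""
    calls.append(text)
    return {"EMAIL": re.findall(r"\S+@\S+\.\w+", text)}


def layered_detect(text: str, deep_detector: Detector) -> Dict[str, List[str]]:
    """Run the expensive detector only on text that the quick screen flags."""
    if not QUICK_SCREEN.search(text):
        return {}
    return deep_detector(text)


texts = [
    "The meeting is on Tuesday.",
    "Reach me at jane.doe@gmail.com today.",
]
for text in texts:
    print(layered_detect(text, stub_deep_detector))
print(f"Deep detector ran on {len(calls)} of {len(texts)} texts")
```

The trade-off is that anything the screen misses never reaches the second stage. A screen this crude would skip names and street addresses entirely, so in critical applications the first layer should be tuned for recall rather than precision.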
Finally, regular monitoring can be a major force for system improvement, allowing us to track false positives and negatives, update patterns to address missed cases, and retrain models to keep them up-to-date.
Conclusion
Robust guardrails for detecting and protecting PII are vital in our data-driven world. Whether you choose a fast and interpretable approach like regex, or opt for the nuanced capabilities of transformer-based models, your guardrail strategy should prioritize continuous monitoring, updates, and alignment with regulatory standards.
Tools like Weave provide visibility into how each guardrail performs, letting you refine your overall data protection strategy. By understanding the strengths and limitations of each approach and updating them regularly, you can maintain high standards of privacy and security, ensuring that sensitive personal information remains well-guarded.