#formalmethods

Cheng Huang’s corner · The Coming AI Revolution in Distributed Systems

Formal verification has long been the gold standard for uncovering subtle bugs in distributed system design [1]. While AI has already proven its ability to accelerate verification processes [2], recent breakthroughs suggest a far more transformative potential: AI can now autonomously generate accurate formal specifications directly from very large production codebases. This capability marks a pivotal moment in software engineering, pointing toward a future where AI-driven correctness verification becomes not just standard practice, but potentially superior to human efforts.

AI-driven Bug Discovery

Our recent work with GitHub Copilot demonstrates this potential in action. The AI autonomously produced precise TLA+ specifications from Azure Storage’s production source code, uncovering a subtle race condition that had evaded traditional code reviews and extensive testing. This achievement illustrates how AI can transform our approach to ensuring correctness in complex distributed systems.

Strategic Planning: AI’s Methodical Approach

We began by tasking GitHub Copilot (with the o3 model) to formulate a comprehensive plan for analyzing a specific feature and generating a corresponding TLA+ specification. The AI responded with an 8-step methodology that demonstrated deep understanding of formal verification principles:

- Systematic cataloging of code paths, data structures, and Paxos/server commands
- Extraction of behavioral patterns and invariants from source code
- Construction of a minimal yet complete TLA+ model
- Integration of failure modes and safety/liveness properties

From Plan to Execution: AI in Action

1. Autonomous Code Analysis and Initial Specification

After reviewing the AI-generated plan, we instructed GitHub Copilot Agent (with the Claude 3.7 Sonnet model) to execute each step, marking completion as it progressed. Notably, we only provided the feature name and the relevant component directory to narrow the search scope. The AI autonomously identified pertinent files and extracted essential information without further guidance. Subsequently, the AI produced comprehensive architecture and behavior documentation, along with an initial TLA+ specification containing over 10 invariants. Impressively, the primary safety invariant matched precisely what we anticipated.

2. Iterative Refinement of the Specification

Upon reviewing the initial specification, we noticed the absence of an etag-based optimistic concurrency control mechanism. We prompted the AI to investigate how etags were utilized to prevent race conditions. In response, the AI revisited the source code, updated the documentation accordingly, and refined the TLA+ specification to incorporate this critical mechanism. We observed that the AI refined the specification iteratively, updating one atomic action at a time. This incremental approach likely helped the AI maintain precision and accuracy throughout the refinement process.

3. Validation through Model Checking

With the refined specification ready, we proceeded to validate it using the TLA+ model checker. Initially, the tool reported a language error, which the AI swiftly corrected. Subsequent model checking uncovered a violation of the primary safety invariant. After providing the violation trace to the AI, it quickly identified the problematic sequence: a concurrent deletion and reference addition scenario.

4. Enhancing the Specification via Git History Analysis

To ensure comprehensive coverage, we asked the AI to analyze the git commit history for the feature using git MCP. This analysis revealed an additional critical safety mechanism involving pessimistic locking, previously overlooked. The AI promptly updated both the documentation and the TLA+ specification to reflect this important discovery.

The Critical Discovery

Within hours of iterative refinement, the AI had surfaced a critical race condition: an old Paxos primary could perform a deletion while a new primary simultaneously added a reference. This subtle bug had escaped detection through traditional methods, yet given Azure Storage’s scale, it would likely have manifested in production eventually.

Looking Ahead: The Autonomous Future

After a decade of manually crafting TLA+ specifications, I must acknowledge that this AI-generated specification rivals human work. This achievement represents more than incremental progress: it demonstrates that the fundamental components for fully automating correctness verification are now available. Based on current capabilities, I envision the following evolution:

- Autonomous System Analysis: AI will independently dissect large-scale distributed systems, mapping critical components and their interactions.
- Intelligent Invariant Discovery: AI will identify and formalize key system invariants without human guidance.
- Self-Validating Specifications: AI-generated TLA+ models will undergo automatic validation, with AI analyzing any violations.
  - Violations will trigger AI-crafted targeted tests.
  - Confirmed bugs will be automatically corrected.
  - Model-implementation discrepancies will drive continuous specification refinement.
- Reinforcement Learning Feedback Loop: Each iteration will generate valuable data, enabling AI to continuously improve its verification capabilities.
- Mastery Through Clear Objectives: Like other domains with verifiable outcomes, distributed system correctness will become an area where AI consistently outperforms human practitioners.

While we haven’t reached this destination yet, the path forward is clear. Even if AI development paused today, the foundation for an AI-driven correctness revolution has already been laid.

Opinions are personal.
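
To make the race described in that post concrete, here is a rough, hypothetical Python sketch; it is not Azure Storage code and not the AI-generated TLA+ specification, and every name in it is invented. In the spirit of a model checker, it enumerates every interleaving of a stale old-primary delete and a new-primary add-reference, checks the invariant that a deleted object has no live references, and shows how an etag-style conditional write closes the race.

```python
from itertools import permutations

def run(order, guarded):
    state = {"exists": True, "etag": 1, "refs": set()}
    seen = {}  # what each primary observed during its read step

    def delete_read():    # old primary observes the object
        seen["d_etag"], seen["d_refs"] = state["etag"], len(state["refs"])

    def delete_write():   # old primary deletes only if it looked unreferenced
        if seen["d_refs"] == 0:
            if guarded and state["etag"] != seen["d_etag"]:
                return    # conditional write: object changed since the read, abort
            state["exists"] = False

    def addref_read():    # new primary observes the object
        seen["a_etag"], seen["a_ok"] = state["etag"], state["exists"]

    def addref_write():   # new primary adds a reference only if the object looked alive
        if seen["a_ok"]:
            if guarded and (not state["exists"] or state["etag"] != seen["a_etag"]):
                return    # conditional write: object changed or was deleted, abort
            state["refs"].add("r1")
            state["etag"] += 1  # every successful write bumps the etag

    steps = {"DR": delete_read, "DW": delete_write,
             "AR": addref_read, "AW": addref_write}
    for s in order:
        steps[s]()
    # Safety invariant: a deleted object must not have live references.
    return state["exists"] or not state["refs"]

def interleavings():
    # All orderings of the four steps that keep each primary's read before its write.
    for p in permutations(("DR", "DW", "AR", "AW")):
        if p.index("DR") < p.index("DW") and p.index("AR") < p.index("AW"):
            yield p

for guarded in (False, True):
    bad = [p for p in interleavings() if not run(p, guarded)]
    print(f"etag guard {'on' if guarded else 'off'}: {len(bad)} violating interleavings")
```

With the guard disabled, several interleavings violate the invariant; with it enabled, none do, which is roughly the property an etag-based optimistic concurrency check in the refined specification is meant to enforce.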

If you’re a #haskell user or a proponent of #functionalprogramming, please consider donating to the #Purescript project to keep it alive.

It’s a fairly obscure project but, IMO, it is THE language for web #frontend.

On top of that, the backend was recently rewritten in #ChezScheme, which tends to be a very popular choice in the #formalmethods world due to its rigor.

I see #ghcjs nipping at its heels, but IMO PS will always be a more bespoke and opinionated tool.

opencollective.com/purescript/


*Last Call*

I have a #PhD position for UK students, available with myself and @bentnib

This project will develop new static analysis methods, derived from advanced programming language research, for asserting the resilience of existing communicating systems.

*Hard Deadline*: Wednesday 16th April 2025

You will belong to @StrathCyber and @mspstrath, as well as gaining access to @spli

strath.ac.uk/studywithus/postg

(Ignore the deadline on the advert)

Please spread the word.

Towards Type-Driven Assurance of Communicating Systems | University of Strathclyde

Reading the new experience report paper "Systems Correctness Practices at AWS" by @marcbrooker & Ankush Desai, a successor to the 2015 paper "How Amazon Web Services Uses Formal Methods". It documents a whole buffet of industrial formal methods use: P (including the new tool PObserve for runtime trace validation), deterministic simulation testing in Rust with the open-sourced Shuttle and Turmoil tools, Dafny, HOL Light, and the open-sourced Kani model checker for Rust.
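
For readers who haven't met the technique, here is a toy, hypothetical illustration of deterministic simulation testing in Python; it does not use Shuttle or Turmoil and is not AWS code. Tasks yield wherever they touch shared state, a seeded scheduler chooses the interleaving, and any invariant violation replays exactly from its seed.

```python
import random

def transfer(accounts, src, dst, amount):
    # A non-atomic read-modify-write, written as a generator so the scheduler
    # decides where the task can be preempted.
    balance = accounts[src]
    yield
    accounts[src] = balance - amount
    yield
    balance = accounts[dst]
    yield
    accounts[dst] = balance + amount

def simulate(seed):
    rng = random.Random(seed)           # the only source of nondeterminism
    accounts = {"a": 100, "b": 0}
    tasks = [transfer(accounts, "a", "b", 30), transfer(accounts, "a", "b", 20)]
    while tasks:
        task = rng.choice(tasks)        # scheduler picks which task runs next
        try:
            next(task)
        except StopIteration:
            tasks.remove(task)
    return accounts["a"] + accounts["b"]

# Invariant: money is conserved. Any seed that breaks it is a bug report
# that reproduces exactly, every time, by re-running with that seed.
for seed in range(1000):
    if simulate(seed) != 100:
        print(f"lost update found; replay deterministically with seed={seed}")
        break
```

The point is reproducibility: the same seed always yields the same schedule, so a failing concurrent run can be debugged like an ordinary unit test.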

While TLA⁺ was the most prominently featured tool in the 2015 paper, it's been lost in the crowd here as part of a clear shift toward verifying & testing the actual running code. I think TLA⁺ must carve out a niche for itself in a world where deterministic simulation testing becomes a commodity technology, or it risks losing relevance the same way other design-level tools like UML did. There are existing case studies on using TLA⁺ for trace validation and model-driven testing, but a lot of effort needs to go into tooling to make such integrations as smooth as possible rather than bespoke one-off projects.
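
As a rough, hypothetical sketch of what trace validation means in miniature (this is not PObserve and not the TLA⁺ trace-validation tooling referenced above): events logged by the running system are replayed against an abstract state-machine spec, and the first event the spec does not allow marks a divergence between code and model.

```python
# Abstract spec of a single lock, as allowed (state, event) -> next-state transitions.
SPEC = {
    ("free", "acquire"): "held",
    ("held", "release"): "free",
}

def validate(trace, initial="free"):
    """Replay a recorded event trace against the spec; report the first illegal event."""
    state = initial
    for i, event in enumerate(trace):
        nxt = SPEC.get((state, event))
        if nxt is None:
            return False, i       # the implementation did something the model forbids
        state = nxt
    return True, len(trace)

# A trace as it might be scraped from production logs (hypothetical): the double
# acquire at index 3 is flagged as a divergence between code and model.
print(validate(["acquire", "release", "acquire", "acquire"]))   # -> (False, 3)
```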

dl.acm.org/doi/10.1145/3712057

Queue · Systems Correctness Practices at AWS: Leveraging Formal and Semi-formal Methods (Vol. 22, No. 6)
Building reliable and secure software requires a range of approaches to reason about systems correctness. Alongside industry-standard testing methods (such as unit and integration testing), AWS has adopted model checking, fuzzing, property-based testing, ...

Good quote:
Programming language design is kind of just the unbelievably difficult task of tricking programmers into using formal methods tools, by presenting them in an ergonomic way that doesn’t obstruct how people are used to writing software in a given era. Necessarily each push forward rules out more and more common sources of bugs.
lobste.rs/s/ctpdvb/vine_progra
#plt #typespl #formalmethods #logic

Vine: A programming language based on Interaction Nets | Lobsters

From our colleagues over at @fme: "FME’s Teaching Committee has recently organised a special issue of Formal Aspects of Computing that puts forward different perspectives on why and how Formal Methods should be represented in Computer Science curricula.

A summary article of the discussed perspectives appeared in the December 2024 issue of ACM Inroads: The Role of Formal Methods in Computer Science Education."

fmeurope.org/2024/10/02/does-e

Does every Computer Scientist need to know Formal Methods? · Formal Methods Europe