Swift Parser Combinator Framework

Parsey

In addition to simple combinators, Parsey supports source location/range
tracking, backtracking prevention, and custom error messages.

Features

  • Combinator interface

    • |, ~~, ~~>, <~~, ^^ combinator operators
  • Lexer primitives:

    • Lexer.whitespace, Lexer.signedInteger, ...
  • Regex-like combinators:

    • Postfix .+ for .many().
      • Example: let arrayLiteral = "[" ~~> expression.+ <~~ "]"
    • Postfix .* for .manyOrNone().
      • Example: let classDef = (attribute | method).*
    • Postfix .? for .optional().
      • Example: let declaration = "let" ~~> id ~~ (":" ~~> type).? ~~ ("=" ~~> expression)
    • Postfix + for .manyConcatenated().
      • Example: let skippedSpaces = (Lexer.space | Lexer.tab)+
    • Infix + for .concatenatingResult(with:).
      • Example: let type = Lexer.upperLetter + Lexer.letter*
    • Lexer.regex(_:) for directly applying regular expressions.
      • Example: let id = Lexer.regex("[a-zA-Z][a-zA-Z0-9]*")
  • Backtracking prevention

    • .! postfix operator or .nonbacktracking()
  • Parser tagging for error messages

    • <!-- operator or .tagged(_:)
  • Rich error messages with source location

    • For example:
    Parse failure at 2:4 ----
    (+ %% 1 -20) 2 3)
       ^~~~~~~~~~~~~~
    Expecting an expression, but found "%"
    
  • Source range tracking

    • ^^^ operator or .mapParse(_:)
    • For example, the S-expression "\n(+ \n\n(+ +1 -20) 2 3)" is parsed
      to the following range-tracked AST:
    Expr:(2:1..<4:16):[
        ID:(2:2..<2:3):+,
        Expr:(4:1..<4:11):[
            ID:(4:2..<4:3):+,
            Int:(4:4..<4:6):1,
            Int:(4:7..<4:10):-20],
        Int:(4:12..<4:13):2,
        Int:(4:14..<4:15):3]
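
The line:column pairs in the ranges above can be reproduced by counting newlines while walking the input. The following is a minimal, self-contained sketch of that bookkeeping, not Parsey's implementation; the names Position and position(in:afterConsuming:) are hypothetical and exist only for this illustration.

```swift
/// A 1-based line/column position, printed as "line:column".
/// (Hypothetical type for illustration; not part of Parsey.)
struct Position: Equatable, CustomStringConvertible {
    var line = 1
    var column = 1
    var description: String { return "\(line):\(column)" }
}

/// Returns the position reached after consuming `count` characters of `text`.
/// A newline moves to column 1 of the next line; any other character
/// advances the column by one.
func position(in text: String, afterConsuming count: Int) -> Position {
    var result = Position()
    for character in text.prefix(count) {
        if character == "\n" {
            result.line += 1
            result.column = 1
        } else {
            result.column += 1
        }
    }
    return result
}

let source = "\n(+ \n\n(+ +1 -20) 2 3)"
// The token "+1" occupies character offsets 9 and 10 of `source`.
let start = position(in: source, afterConsuming: 9)
let end = position(in: source, afterConsuming: 11)
print("\(start)..<\(end)")  // prints "4:4..<4:6"
```

The printed range agrees with the Int node for 1 in the AST above. A real parser would carry the position along as it consumes input instead of recounting from the start each time.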
    

Requirements

  • Swift 3

  • Any operating system

Package

To use it in your Swift project, add the following dependency to your
Swift package description file.

    .Package(url: "https://github.com/rxwei/Parsey", majorVersion: 1)
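
For context, a complete manifest in the Swift 3 package description format might look like the sketch below. The package name MyParser is a placeholder chosen for this example; only the dependency line comes from the instructions above.

```swift
// Package.swift (Swift 3 manifest format)
import PackageDescription

let package = Package(
    name: "MyParser",  // hypothetical package name, replace with your own
    dependencies: [
        // Dependency line from the instructions above.
        .Package(url: "https://github.com/rxwei/Parsey", majorVersion: 1)
    ]
)
```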

⚙ Examples

0️⃣ An LLVM Compiler Frontend written in Swift using Parsey

The COOL Programming Language

1️⃣ Parse Left-associative Infix Expressions with Operator Precedence

indirect enum Expression {
    case integer(Int)
    case symbol(String)
    case infix(String, Expression, Expression)
}

enum Grammar {
    static let integer = Lexer.signedInteger
        ^^ {Int($0)!} ^^ Expression.integer

    static let symbol = Lexer.regex("[a-zA-Z][0-9a-zA-Z]*")
        ^^ Expression.symbol

    static let addOp = Lexer.anyCharacter(in: "+-")
        ^^ { op in { Expression.infix(op, $0, $1) } }
    
    static let multOp = Lexer.anyCharacter(in: "*/")
        ^^ { op in { Expression.infix(op, $0, $1) } }

    /// Left-associative multiplication
    static let multiplication = (integer | symbol).infixedLeft(by: multOp)

    /// Left-associative addition
    static let addition = multiplication.infixedLeft(by: addOp)

    static let expression: Parser<Expression> = addition
}

try print(Grammar.expression.parse("2"))
/// Output:
/// Expression.integer(2)

try print(Grammar.expression.parse("2+1+2*a"))
/// Output:
/// Expression.infix("+",
///                  .infix("+", .integer(2), .integer(1)),
///                  .infix("*", .integer(2), .symbol("a")))
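
To make the meaning of "left-associative" concrete, the sketch below builds the same tree by hand with a left fold. It is a standalone illustration that does not use Parsey; foldLeft is a hypothetical helper, and how infixedLeft(by:) is actually implemented is not shown in this README.

```swift
indirect enum Expression: Equatable {
    case integer(Int)
    case symbol(String)
    case infix(String, Expression, Expression)
}

/// Folds `first op1 e1 op2 e2 ...` into a left-leaning tree:
/// ((first op1 e1) op2 e2) ...
/// (Hypothetical helper for illustration; not part of Parsey.)
func foldLeft(_ first: Expression,
              _ rest: [(op: String, operand: Expression)]) -> Expression {
    return rest.reduce(first) { tree, next in
        return Expression.infix(next.op, tree, next.operand)
    }
}

// "2+1+2*a": the higher-precedence product 2*a is grouped first,
// then the additions are folded from the left.
let product = foldLeft(Expression.integer(2),
                       [(op: "*", operand: Expression.symbol("a"))])
let sum = foldLeft(Expression.integer(2),
                   [(op: "+", operand: Expression.integer(1)),
                    (op: "+", operand: product)])
// `sum` has the same shape as the output shown for "2+1+2*a" above.
```

Layering the grammar as addition over multiplication is what gives * and / higher precedence: each operand of an addition is already a complete product.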

2️⃣ Parse S-Expressions

indirect enum Expr {
    case sExp([Expr])
    case int(Int)
    case id(String)
}

enum Grammar {
    static let whitespaces = (Lexer.space | Lexer.tab | Lexer.newLine)+
    static let anInt = Lexer.signedInteger ^^ { Int($0)! } ^^ Expr.int
    static let anID = Lexer.regex("[a-zA-Z_+\\-*/][0-9a-zA-Z_+\\-*/]*") ^^ Expr.id
    static let aSExp: Parser<Expr> =
        "(" ~~> (anExp.!).many(separatedBy: whitespaces).amid(whitespaces.?) <~~ ")"
        ^^ Expr.sExp
    static let anExp = anInt | anID | aSExp <!-- "an expression"
}

/// Success
try Grammar.anExp.parse("(+ (+ 1 -20) 2 3)")
/// Output: Expr.sExp(...)

/// Failure
try Grammar.anExp.parse("(+ \n(+ %% 1 -20) 2 3)")
/// Output: Parse failure at 2:4 ----
///         (+ %% 1 -20) 2 3)
///            ^~~~~~~~~~~~~~
///         Expecting an expression, but found "%"
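
The layout of the failure report above (a location header, the offending line, a caret under the failing column, then the message) can be approximated with a few lines of string handling. This is a sketch under assumptions, not Parsey's own formatting code: renderFailure is a hypothetical function, and underlining through to the end of the line is a choice made for this example.

```swift
/// Formats a failure report for a 1-based `line` and `column` in `source`.
/// (Hypothetical function for illustration; not part of Parsey.)
func renderFailure(in source: String, line: Int, column: Int,
                   message: String) -> String {
    // Keep empty lines so that line numbers stay aligned with the input.
    let lines = source.split(separator: "\n", omittingEmptySubsequences: false)
    guard line >= 1, line <= lines.count, column >= 1 else {
        return "Parse failure at \(line):\(column) ----\n\(message)"
    }
    let text = String(lines[line - 1])
    // Pad up to the failing column, then underline to the end of the line.
    let padding = String(repeating: " ", count: column - 1)
    let tildes = String(repeating: "~", count: max(0, text.count - column))
    return ["Parse failure at \(line):\(column) ----",
            text,
            padding + "^" + tildes,
            message].joined(separator: "\n")
}

let report = renderFailure(in: "(+ \n(+ %% 1 -20) 2 3)",
                           line: 2, column: 4,
                           message: "Expecting an expression, but found \"%\"")
print(report)
```

The message text itself comes from the tag attached with <!--, which is why the report says "an expression" instead of listing every alternative that was tried.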

3️⃣ Parse S-Expressions with Source Range Tracking

indirect enum Expr {
    case sExp([Expr], SourceRange)
    case int(Int, SourceRange)
    case id(String, SourceRange)
}

enum Grammar {
    static let whitespaces = (Lexer.space | Lexer.tab | Lexer.newLine)+

    static let anInt = Lexer.signedInteger 
        ^^^ { Expr.int(Int($0.target)!, $0.range) }

    static let anID = Lexer.regex("[a-zA-Z_+\\-*/][0-9a-zA-Z_+\\-*/]*")
        ^^^ { Expr.id($0.target, $0.range) }

    static let aSExp: Parser<Expr> =
        "(" ~~> (anExp.!).many(separatedBy: whitespaces).amid(whitespaces.?) <~~ ")"
        ^^^ { Expr.sExp($0.target, $0.range) }

    static let anExp = anInt | anID | aSExp <!-- "an expression"
}

/// Success
try Grammar.anExp.parse("(+ (+ 1 -20) 2 3)")
/// Output: Expr.sExp(...)

/// Failure
try Grammar.anExp.parse("(+ \n(+ %% 1 -20) 2 3)")
/// Output: Parse failure at 2:4 ----
///         (+ %% 1 -20) 2 3)
///            ^~~~~~~~~~~~~~
///         Expecting an expression, but found "%"

Dependency

License

MIT License

GitHub

https://github.com/rxwei/Parsey