From patchwork Fri Oct 8 19:09:54 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Sixt
X-Patchwork-Id: 12546227
Message-Id: <5a84fc9cf715aec258d9cda2dd7d2e8eff2dc66c.1633720197.git.gitgitgadget@gmail.com>
Date: Fri, 08 Oct 2021 19:09:54 +0000
Subject: [PATCH v2 2/5] t4034: add tests showing problematic cpp tokenizations
To: git@vger.kernel.org
Cc: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason, Johannes Sixt, Johannes Sixt
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Johannes Sixt

From: Johannes Sixt

The word regex is too loose and matches long streaks of characters
that should actually be separate tokens. Add these problematic test
cases.

Separate the lines with text that will remain identical in the pre-
and post-image so that the diff algorithm will not lump removals and
additions of consecutive lines together. This makes the expected
output easier to read.

Signed-off-by: Johannes Sixt
---
 t/t4034/cpp/expect | 22 ++++++++++++++++++----
 t/t4034/cpp/post | 18 ++++++++++++++++--
 t/t4034/cpp/pre | 16 +++++++++++++++-
 3 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/t/t4034/cpp/expect b/t/t4034/cpp/expect
index 41976971b93..63e53a61e62 100644
--- a/t/t4034/cpp/expect
+++ b/t/t4034/cpp/expect
@@ -1,11 +1,25 @@
 diff --git a/pre b/post
-index c5672a2..4229868 100644
+index 1229cdb..3feae6f 100644
 --- a/pre
 +++ b/post
-@@ -1,16 +1,16 @@
-Foo() : x(0&&1&42) { bar(x); }
+@@ -1,30 +1,30 @@
+Foo() : x(0&&1&42) { foo0bar(x.f.Find); }
 cout<<"Hello World!?\n"<(1) (-1e10) (0xabcdef) 'xy'
+(1 -1e10+1e10 0xabcdef) 'xy'
+// long double
+3.141592653e-10l3.141592654e+10l
+// float
+120E5fE6f
+// hex
+0xdeadbeaf+80xdeadBeaf+7ULL
+// octal
+0123456701234560
+// binary
+0b10000b1100+e1
+// expression
+1.5-e+2+f1.5-e+3+f
+// another one
+str.e+65.e+75
 [a] b->->*v d.e.*e
 ~!a !~b c+++ d--- e**f g&&&h
 a**=b c//=d e%%=f
diff --git a/t/t4034/cpp/post b/t/t4034/cpp/post
index 4229868ae62..3feae6f430f 100644
--- a/t/t4034/cpp/post
+++ b/t/t4034/cpp/post
@@ -1,6 +1,20 @@
-Foo() : x(0&42) { bar(x); }
+Foo() : x(0&42) { bar(x.Find); }
 cout<<"Hello World?\n"<*v d.*e
 ~!a !~b c+ d- e**f g&&h
 a*=b c/=d e%=f
diff --git a/t/t4034/cpp/pre b/t/t4034/cpp/pre
index c5672a24cfc..1229cdb59d1 100644
--- a/t/t4034/cpp/pre
+++ b/t/t4034/cpp/pre
@@ -1,6 +1,20 @@
-Foo():x(0&&1){}
+Foo():x(0&&1){ foo0( x.find); }
 cout<<"Hello World!\n"<v d.e
 !a ~b c++ d-- e*f g&h
 a*b c/d e%f