Commit Graph

44 Commits

Author SHA1 Message Date
Aleksey Kladov f7be59c593 Introduce expect snapshot testing library into rustc
Snapshot testing is a technique for writing maintainable unit tests.
Unlike the usual `assert_eq!` tests, snapshot tests let you
*automatically* update expected values on test failure.
In a sense, snapshot tests are an inline version of our beloved
UI tests.

Example:

![expect](https://user-images.githubusercontent.com/1711539/90888810-3bcc8180-e3b7-11ea-9626-d06e89e1a0bb.gif)

The particular library we use, `expect_test`, provides an `expect!`
macro, which creates a sort of self-updating string literal (by using
the `file!` macro). Self-update is triggered by setting the `UPDATE_EXPECT`
environment variable (this info is printed when a test fails).
This library was extracted from rust-analyzer, where we use it for
most of our tests.

There are some other, more popular snapshot testing libraries:

* https://github.com/mitsuhiko/insta
* https://github.com/aaronabramov/k9

The main differences of `expect` are:

* first-class snapshot objects (so, tests can be written as functions,
  rather than as macros)
* focus on inline-snapshots (but file snapshots are also supported)
* restricted feature set (only `assert_eq` and `assert_debug_eq`)
* no extra runtime (ie, no `cargo insta`)
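
To make the first bullet concrete, here is a minimal std-only sketch of
what "first-class snapshot objects" buys us. It is not the `expect_test`
implementation: `Snapshot`, `compare` and `check_words` are hypothetical
names made up for this illustration, and the self-updating part is left
out entirely. With the real crate, the snapshot would be created by the
`expect!` macro and checked with its `assert_eq` method.

```rust
/// Hypothetical stand-in for a first-class snapshot value.
/// The real library also records where the literal lives (via `file!`)
/// so that it can rewrite it in place; this sketch only compares.
struct Snapshot {
    expected: &'static str,
}

impl Snapshot {
    /// Returns `Err` with a readable message on mismatch instead of panicking.
    fn compare(&self, actual: &str) -> Result<(), String> {
        if self.expected == actual {
            Ok(())
        } else {
            Err(format!(
                "snapshot mismatch\nexpected: {:?}\nactual:   {:?}",
                self.expected, actual
            ))
        }
    }
}

/// Because the snapshot is an ordinary value, the test helper can be a
/// plain function that takes it as an argument (no macro needed).
/// Hypothetical helper: joins the whitespace-separated words with `|`.
fn check_words(input: &str, snapshot: Snapshot) -> Result<(), String> {
    let actual = input.split_whitespace().collect::<Vec<_>>().join("|");
    snapshot.compare(&actual)
}
```

Each test then becomes a single call such as
`check_words("fn  main", Snapshot { expected: "fn|main" })`.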

See https://github.com/rust-analyzer/rust-analyzer/pull/5101 for
an extended comparison.

It is unclear if this testing style will stick with rustc in the long
run. At the moment, rustc is mainly tested via integrated UI tests.
But in the library-ified world, unit tests will become somewhat more
important (that's why we use the library-ified `rustc_lexer` as
an example in this PR). Given that the cost of removal shouldn't be
too high, it probably makes sense to just see if this flies!
2020-08-24 15:38:42 +02:00
bors b51651ae9d Auto merge of #75642 - matklad:lexer-comments, r=petrochenkov
Move doc comment parsing to rustc_lexer

Plain comments are trivia, while doc comments are not, so it feels
like this belongs to the rustc_lexer.

The specific reason to do this is the desire to use rustc_lexer in
rustdoc for syntax highlighting, without duplicating "is this a doc
comment?" logic there.

r? @ghost
2020-08-21 06:05:39 +00:00
Aleksey Kladov 52992979c5 Rename rustc_lexer::TokenKind::Not to Bang
All other tokens are named by the punctuation they use, rather than
by the semantic operation they stand for. `!` is the only exception to
the rule; let's fix it.
2020-08-20 16:55:19 +02:00
Aleksey Kladov 39197e673e Move doc comment parsing to rustc_lexer
Plain comments are trivia, while doc comments are not, so it feels
like this belongs to the rustc_lexer.

The specific reason to do this is the desire to use rustc_lexer in
rustdoc for syntax highlighting, without duplicating "is this a doc
comment?" logic there.
2020-08-19 22:53:16 +02:00
Vadim Petrochenkov 20c5044465 Introduce `rustc_lexer::is_ident` and use it in a couple of places 2020-08-11 00:08:04 +03:00
Manish Goregaokar 3f90287bb6
Rollup merge of #73856 - pierwill:pierwill-lexer-doc, r=jonas-schievink
Edit librustc_lexer top-level docs

Minor edit, and adds link to librustc_parse::lexer.
2020-07-06 17:45:17 -07:00
pierwill 36e50a0fb3 Edit librustc_lexer top-level docs
Add link to librustc_parse::lexer
2020-07-06 16:01:47 -07:00
pierwill 49c1018d13 Fix markdown rendering in librustc_lexer docs
Use back-ticks instead of quotation marks in docs for the block comment
variant of TokenKind.
2020-06-28 13:19:59 -07:00
Vadim Petrochenkov 7b2064f4f9 rustc_lexer: Simplify shebang parsing once more 2020-06-26 19:52:19 +03:00
Matthias Krüger 58023fedfc Fix more clippy warnings
Fixes more of:

clippy::unused_unit
clippy::op_ref
clippy::useless_format
clippy::needless_return
clippy::useless_conversion
clippy::bind_instead_of_map
clippy::into_iter_on_ref
clippy::redundant_clone
clippy::nonminimal_bool
clippy::redundant_closure
clippy::option_as_ref_deref
clippy::len_zero
clippy::iter_cloned_collect
clippy::filter_next
2020-06-09 18:51:08 +02:00
Julian Wollersberger 5fbbfbbfa9 Simplify raw string error reporting.
This makes `UnvalidatedRawStr` and `ValidatedRawStr` unnecessary and removes 70 lines.
2020-06-01 22:01:19 +02:00
Vadim Petrochenkov 21755b58c9 rustc_lexer: Optimize shebang detection slightly 2020-05-29 22:55:58 +03:00
Russell Cohen a93d31603f Fix bug in shebang handling
Shebang handling was too aggressive in stripping out the first line in cases where it is actually _not_ a shebang but valid Rust (#70528). This is a second attempt at resolving this issue; the first attempt was flawed for, among other reasons, causing an ICE in certain cases (#71372, #71471).

The behavior is now codified by a number of UI tests, but simply:
For the first line to be a shebang, the following must all be true:
1. The line must start with `#!`
2. The line must contain a non-whitespace character after `#!`
3. The next character in the file, ignoring comments & whitespace, must not be `[`

I believe this is a strict superset of what we used to allow, so perhaps a crater run is unnecessary, but probably not a terrible idea.
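
For illustration, the three rules can be sketched roughly as follows.
This is a simplified sketch and not the code in this commit:
`shebang_len` and `next_significant_char` are hypothetical names, and
the comment skipping is an assumption that only handles `//` line
comments and non-nested `/* */` block comments.

```rust
/// Returns the byte length of the shebang line (without the line
/// terminator), or `None` if the input does not start with a shebang.
fn shebang_len(input: &str) -> Option<usize> {
    // Rule 1: the line must start with `#!`.
    let rest = input.strip_prefix("#!")?;
    let first_line = rest.lines().next().unwrap_or("");
    // Rule 2: the line must contain a non-whitespace character after `#!`.
    if first_line.trim().is_empty() {
        return None;
    }
    // Rule 3: the next significant character must not be `[`,
    // otherwise this is the start of an inner attribute like `#![...]`.
    if next_significant_char(rest) == Some('[') {
        return None;
    }
    Some(2 + first_line.len())
}

/// Skips whitespace and (simplified) comments, then returns the first
/// remaining character, if any.
fn next_significant_char(mut s: &str) -> Option<char> {
    loop {
        s = s.trim_start();
        if let Some(after) = s.strip_prefix("//") {
            // Line comment: continue from the end of the line (or input).
            s = after.find('\n').map_or("", |i| &after[i..]);
        } else if let Some(after) = s.strip_prefix("/*") {
            // Block comment: continue after the closing `*/`; an
            // unterminated comment swallows the rest of the input.
            s = after.find("*/").map_or("", |i| &after[i + 2..]);
        } else {
            return s.chars().next();
        }
    }
}
```

Under these assumptions, `#!/usr/bin/env run-cargo-script` is treated as
a shebang, while `#![allow(unused)]` and `#! /* note */ [allow(unused)]`
are left alone as attributes.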
2020-05-25 10:11:08 -04:00
Julian Wollersberger e734e31340 Small doc improvements.
The phrasing is from the commit description of 395ee0b79f by @Matklad.
2020-05-09 13:46:03 +02:00
Eduard-Mihai Burtescu 4d67c8da55 Revert "Rollup merge of #71372 - ayushmishra2005:shebang_stripping, r=estebank"
This reverts commit 46a8dcef5c, reversing
changes made to f28e3873c5.
2020-04-28 13:02:58 +03:00
Ayush Kumar Mishra 1b362cd1d5 Minor refactoring 2020-04-21 22:29:20 +05:30
Ayush Kumar Mishra ee5a2120f9 Refactoring and added test-cases #70528 2020-04-21 16:48:58 +05:30
Ayush Kumar Mishra 0315864260 Fix #! (shebang) stripping account space issue #70528 2020-04-21 11:44:00 +05:30
Russell Cohen f543689eb6 Handle unterminated raw strings with no #s properly
The modified code to handle parsing raw strings didn't properly account for the case where there was no "#" on either end and erroneously reported these strings as complete. This led to a panic when trying to read off the end of the file.
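
A rough sketch of the invariant involved, assuming a scanner that starts
right after the `r` prefix. `scan_raw_string` is a hypothetical helper
written for this note, not the function touched by this commit.

```rust
/// Scans a raw string literal, starting right after the `r` prefix.
/// Returns `None` if the input is not a raw string at all, otherwise
/// `(number of opening hashes, whether a matching terminator exists)`.
fn scan_raw_string(after_r: &str) -> Option<(usize, bool)> {
    // Count the opening `#`s; zero is a perfectly valid count.
    let n_hashes = after_r.bytes().take_while(|&b| b == b'#').count();
    // `#` is ASCII, so slicing at `n_hashes` is on a char boundary.
    let body = after_r[n_hashes..].strip_prefix('"')?;
    // The terminator is `"` followed by as many `#`s as opened the literal.
    // With zero hashes that is a bare `"`, and hitting the end of input
    // without seeing it must still be reported as unterminated.
    let terminator = format!("\"{}", "#".repeat(n_hashes));
    Some((n_hashes, body.contains(&terminator)))
}
```

The zero-hash case goes through the same path as every other count, so
`r"abc` comes back as unterminated rather than complete.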
2020-04-02 01:02:55 -04:00
Russell Cohen 20e21902bb Clean up redundant conditions and match exprs 2020-03-30 12:39:40 -04:00
Russell Cohen c15f86b4b3 Cleanup error messages, improve docstrings 2020-03-29 11:12:48 -04:00
Russell Cohen 629e97a5a0 Improve error messages for raw strings (#60762)
This diff improves error messages around raw strings in a few ways:
- Catch extra trailing `#` in the parser. This can't be handled in the lexer because we could be in a macro that actually expects another # (see test)
- Refactor & unify error handling in the lexer between ByteStrings and RawByteStrings
- Detect potentially intended terminators (longest sequence of "#*" is suggested)
2020-03-29 00:43:43 -04:00
Matthias Krüger ad00e91887 remove redundant returns (clippy::needless_return) 2020-03-20 20:23:03 +01:00
Drew Ripberger 026dec5500
Spelling error "represening" to "representing" 2020-02-13 11:14:21 -05:00
Mark Rousskov a06baa56b9 Format the world 2019-12-22 17:42:47 -05:00
Mazdak Farrokhzad 4ae2728fa8 move syntax::parse -> librustc_parse
also move MACRO_ARGUMENTS -> librustc_parse
2019-11-10 03:57:18 +01:00
Igor Aleksanov e8b8d2a725 librustc_lexer: Reorder imports in lib.rs 2019-11-04 06:27:25 +03:00
Igor Aleksanov ecd26739d4 librustc_lexer: Simplify "lifetime_or_char" method 2019-11-04 06:27:18 +03:00
Igor Aleksanov 6e350bd999 librustc_lexer: Simplify "raw_double_quoted_string" method 2019-11-03 12:55:50 +03:00
Igor Aleksanov d6f722d79c librustc_lexer: Simplify "double_quoted_string" method 2019-11-03 12:55:05 +03:00
Igor Aleksanov 649a5247f5 librustc_lexer: Simplify "single_quoted_string" method 2019-11-03 12:54:23 +03:00
Igor Aleksanov e0c45f7ee7 librustc_lexer: Make "eat_float_exponent" return bool instead of result 2019-11-03 11:43:47 +03:00
Igor Aleksanov 72767a8056 librustc_lexer: Introduce "eat_while" and "eat_identifier" methods 2019-11-03 11:42:08 +03:00
Igor Aleksanov 0825b357d8 librustc_lexer: Add methods "first" and "second" to the "Cursor" 2019-11-03 11:39:39 +03:00
Igor Aleksanov 993b920032 librustc_lexer: Enhance documentation
Apply review suggestions
2019-10-27 20:08:08 +03:00
Aleksey Kladov 206fe8e1c3 flatten rustc_lexer::character_properties module
On the call site, `rustc_lexer::is_whitespace` reads much better than
`character_properties::is_whitespace`.
2019-09-04 15:13:29 +03:00
Aleksey Kladov a0c186c34f remove XID and Pattern_White_Space unicode tables from libcore
They are only used by rustc_lexer, and are not needed elsewhere.

So we move the relevant definitions into rustc_lexer (while the actual
unicode data comes from the unicode-xid crate) and make the rest of
the compiler use it.
2019-09-04 13:11:11 +03:00
Aleksey Kladov 8b932dfda7 remove composite tokens support from the lexer 2019-08-19 21:59:09 +03:00
Aleksey Kladov 911398b96c remove special handling of \r\n from the lexer 2019-08-14 16:38:40 +03:00
Mark Rousskov f11ffd3a6a
Rollup merge of #62869 - matklad:feature-gate, r=Mark-Simulacrum
add rustc_private as a proper language feature gate

At the moment, `rustc_private` as a (library) feature exists by
accident: `char::is_xid_start`, `char::is_xid_continue` methods in
libcore define it.

cc https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/How.20to.20declare.20new.20langauge.20feature.3F

I don't know if this is at all reasonable, but at least tests seem to pass locally. That probably means that we can remove the feature in libcore, or rename it to something more reasonable, in the next release?
2019-07-23 12:51:18 -04:00
Aleksey Kladov 7e612c19be
Update src/librustc_lexer/src/lib.rs
Co-Authored-By: Ralf Jung <post@ralfj.de>
2019-07-23 10:38:18 +03:00
Aleksey Kladov 27b703dd40 add rustc_private as a proper language feature gate
At the moment, `rustc_private` as a (library) feature exists by
accident: `char::is_xid_start`, `char::is_xid_continue` methods in
libcore define it.
2019-07-22 16:32:13 +03:00
Aleksey Kladov e63fe150bf move unescape module to rustc_lexer 2019-07-21 16:46:11 +03:00
Aleksey Kladov 395ee0b79f Introduce rustc_lexer
The idea here is to make a reusable library out of the existing
rust-lexer, by separating out pure lexing and rustc-specific concerns,
like spans, error reporting and interning.

So, rustc_lexer operates directly on `&str`, produces simple tokens
which are a pair of type-tag and a bit of original text, and does not
report errors, instead storing them as flags on the token.
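
As a rough illustration of that shape (and not the actual `rustc_lexer`
API), a toy lexer in the same style could look like this. All names
here (`Kind`, `Tok`, `first_token`, `prefix_len`, `tokenize`) are
hypothetical, and the token set is deliberately tiny: no escapes, no
numbers, no comments.

```rust
#[derive(Debug, PartialEq)]
enum Kind {
    Whitespace,
    Ident,
    /// A double-quoted string. `terminated: false` is the error flag:
    /// the lexer records the problem and leaves reporting to the caller.
    Str { terminated: bool },
    Unknown,
}

/// A token is just a type tag plus the byte length of its text.
#[derive(Debug, PartialEq)]
struct Tok {
    kind: Kind,
    len: usize,
}

/// Lexes one token from the start of `input`; `None` on empty input.
fn first_token(input: &str) -> Option<Tok> {
    let mut chars = input.chars();
    let first = chars.next()?;
    let (kind, len) = if first.is_whitespace() {
        (Kind::Whitespace, prefix_len(input, char::is_whitespace))
    } else if first.is_alphabetic() || first == '_' {
        (Kind::Ident, prefix_len(input, |c| c.is_alphanumeric() || c == '_'))
    } else if first == '"' {
        match chars.as_str().find('"') {
            // Opening quote + body + closing quote.
            Some(i) => (Kind::Str { terminated: true }, 1 + i + 1),
            // No closing quote: consume the rest and set the flag.
            None => (Kind::Str { terminated: false }, input.len()),
        }
    } else {
        (Kind::Unknown, first.len_utf8())
    };
    Some(Tok { kind, len })
}

/// Byte length of the longest prefix whose chars all satisfy `pred`.
fn prefix_len(input: &str, pred: impl Fn(char) -> bool) -> usize {
    input.find(|c: char| !pred(c)).unwrap_or(input.len())
}

/// Pairs every token kind with the slice of original text it covers.
fn tokenize(mut input: &str) -> Vec<(Kind, &str)> {
    let mut out = Vec::new();
    while let Some(tok) = first_token(input) {
        let (text, rest) = input.split_at(tok.len);
        out.push((tok.kind, text));
        input = rest;
    }
    out
}
```

Nothing here knows about spans, interning or diagnostics. A caller that
wants an "unterminated string" error inspects the flag and reports it in
whatever way suits it.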
2019-07-20 21:12:34 +03:00