Skip to content

parser: fix CRLF(\r\n) file parsing error, SyntaxError: 'invalid syntax' #219

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 14, 2023

Conversation

wetor
Copy link
Contributor

@wetor wetor commented Mar 12, 2023

The python file on the Windows platform may be in CRLF format. For example, git may convert the file to CRLF under Windows, resulting in parsing failure

@ncw
Copy link
Collaborator

ncw commented Mar 13, 2023

The fix looks reasonable to me.

Can you link some docs showing .py files can end CR LF please?

@codecov
Copy link

codecov bot commented Mar 13, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (337df2a) 74.41% compared to head (42f20b7) 74.42%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #219   +/-   ##
=======================================
  Coverage   74.41%   74.42%           
=======================================
  Files          76       76           
  Lines       12673    12675    +2     
=======================================
+ Hits         9431     9433    +2     
  Misses       2567     2567           
  Partials      675      675           
Impacted Files Coverage Δ
parser/lexer.go 89.07% <100.00%> (+0.04%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@wetor
Copy link
Contributor Author

wetor commented Mar 13, 2023

here https://docs.python.org/3.5/reference/lexical_analysis.html
of course, we can also use "build tag" to do special processing on the Windows platform

@ncw
Copy link
Collaborator

ncw commented Mar 14, 2023

The docs you linked are quite clear

A physical line is a sequence of characters terminated by an end-of-line sequence. In source files, any of the standard platform line termination sequences can be used - the Unix form using ASCII LF (linefeed), the Windows form using the ASCII sequence CR LF (return followed by linefeed), or the old Macintosh form using the ASCII CR (return) character. All of these forms can be used equally, regardless of platform.

So this should be done on any platform, so we don't need build tags.

Also, theoretically, we should be looking at CR terminated lines. However that is a bigger change to the lexer, so let's not do that until someone requests it as I don't think macs have used CR terminated files for a very long time.

I'll merge this now - thank you :-)

@ncw ncw merged commit 652daef into go-python:main Mar 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants