UI: Detect and restore encoding and BOM in content #6727

Merged
merged 11 commits into from
Apr 26, 2019

zeripath (Contributor) commented Apr 23, 2019

Fixes #6716

When decoding content, if the first 3 bytes match the UTF-8 BOM, remove them.

When updating content through the editor, detect the encoding and BOM of the previous content and re-encode the new content to match. If the content cannot be encoded that way, fall back to UTF-8.

Signed-off-by: Andrew Thornton [email protected]

@zeripath added the type/bug and topic/ui labels Apr 23, 2019
@zeripath added this to the 1.9.0 milestone Apr 23, 2019
zeripath (Contributor Author) commented Apr 23, 2019

Backporting this to 1.8 is less easy now because of the reencode step, but still possible.

@GiteaBot added the lgtm/need 2 label Apr 23, 2019
codecov-io commented Apr 23, 2019

Codecov Report

Merging #6727 into master will decrease coverage by 0.02%.
The diff coverage is 24.44%.

@@            Coverage Diff            @@
##           master   #6727      +/-   ##
=========================================
- Coverage   41.03%     41%   -0.03%     
=========================================
  Files         421     421              
  Lines       57967   58050      +83     
=========================================
+ Hits        23784   23804      +20     
- Misses      31024   31078      +54     
- Partials     3159    3168       +9
| Impacted Files | Coverage Δ | |
|---|---|---|
| modules/repofiles/update.go | 39.47% <23.37%> (-5.27%) | ⬇️ |
| modules/templates/helper.go | 48.66% <25%> (-0.3%) | ⬇️ |
| modules/base/tool.go | 72.26% <40%> (-0.42%) | ⬇️ |
| models/gpg_key.go | 55.83% <0%> (-0.84%) | ⬇️ |
| modules/log/event.go | 65.98% <0%> (+1.52%) | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4c34bc1...3dbc3b5.

@zeripath changed the title from "UI: Detect and remove a decoded BOM" to "UI: Detect BOM and restore encoding and BOMs on updates" Apr 23, 2019
silverwind (Member) commented

Tested this and can confirm it works well. The BOM is no longer rendered, and it is preserved, if present, when editing the file in the web editor.

@GiteaBot added the lgtm/need 1 label and removed the lgtm/need 2 label Apr 25, 2019
@GiteaBot added the lgtm/done label and removed the lgtm/need 1 label Apr 25, 2019
zeripath (Contributor Author) commented

Should I provide a backport?

lafriks (Member) commented Apr 26, 2019

Yes, please do so.

@@ -267,6 +267,10 @@ func ToUTF8WithErr(content []byte) (string, error) {
	if err != nil {
		return "", err
	} else if charsetLabel == "UTF-8" {
		if len(content) > 2 && bytes.Equal(content[0:3], base.UTF8BOM) {

How about creating a function named RemoveUTF8BOM(content string) string?

Done
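
For illustration, the helper suggested in this review thread could look something like the sketch below. It is not the function that was actually committed. It works on a byte slice, as the diff context does, rather than on the string signature proposed in the comment, and it assumes `base.UTF8BOM` holds the bytes EF BB BF. The local names `utf8BOM` and `removeUTF8BOM` are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// utf8BOM stands in for base.UTF8BOM from the diff (assumed to be EF BB BF).
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// removeUTF8BOM returns content without a leading UTF-8 BOM.
// Content that does not start with the BOM is returned unchanged.
func removeUTF8BOM(content []byte) []byte {
	return bytes.TrimPrefix(content, utf8BOM)
}

func main() {
	withBOM := []byte("\xEF\xBB\xBF# Title")
	fmt.Printf("%q\n", removeUTF8BOM(withBOM))
	fmt.Printf("%q\n", removeUTF8BOM([]byte("# Title")))
}
```

Using `bytes.TrimPrefix` avoids the explicit length check in the diff, since it handles content shorter than three bytes on its own.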

lunny (Member) commented Apr 26, 2019

@zeripath see my comment.

zeripath (Contributor Author) commented

I also noticed that I wasn't dealing with updating LFSed content so that's done too.

zeripath (Contributor Author) commented

(well it's dealt with in the same way we deal with it on the front end.)

zeripath (Contributor Author) commented Apr 26, 2019

Don't merge - I've just noticed that the LFS stuff is slightly wrong. Fixed

zeripath (Contributor Author) commented

OK fixed!

zeripath added a commit to zeripath/gitea that referenced this pull request Apr 26, 2019
Detect and remove a decoded BOM when showing content.
Restore the previous encoding and BOM when updating content.
On error keep as UTF-8 encoding.

Signed-off-by: Andrew Thornton <[email protected]>
@zeripath changed the title from "UI: Detect BOM and restore encoding and BOMs on updates" to "UI: Detect and restore encoding and BOM in content" Apr 26, 2019
@lafriks merged commit f6eedd4 into go-gitea:master Apr 26, 2019
@lafriks added the backport/done label Apr 26, 2019
techknowlogick pushed a commit that referenced this pull request Apr 27, 2019
Detect and remove a decoded BOM when showing content.
Restore the previous encoding and BOM when updating content.
On error keep as UTF-8 encoding.

Signed-off-by: Andrew Thornton <[email protected]>
@zeripath deleted the fix-#6716 branch May 2, 2019 18:46
@go-gitea locked and limited conversation to collaborators Nov 24, 2020
Labels: backport/done, lgtm/done, topic/ui, type/bug
Successfully merging this pull request may close these issues.

Fix Byte Order Mark (BOM) handling in markdown display and editor.
7 participants