
How to check if a .txt file uses line endings for Windows, Unix, or Mac?

Posted: Fri Feb 23, 2024 1:09 am
by marcoagpinto
Heya,

Based on ChatGPT, I came up with some code to check if the line endings of a text file are Windows, Unix, or Mac.

Will this ReadByte command work with UTF-8 for letters/symbols that use two bytes?

Thanks!

Code: Select all

Procedure words_missing_in_master_wordlist_import_and_process_pre_count_words_in_file(file$)
  
  ; Refresh the gadgets, since they had stopped updating
  GadgetsRefresh()
  
  
  Debug "file$:"+file$
  
  ; Load the .txt file
  If ReadFile(1,file$)=0 : ProcedureReturn 0 : EndIf ; Nothing to count if the file cannot be opened
    ReadStringFormat(1)
    
    location_of_file.q=Loc(1)
 
    ; Check the line ending, Windows, Unix, Mac
    file_ending$=""
    Repeat
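      ; Read unsigned bytes: ReadByte() is signed, so bytes above $7F would give Chr() a negative value
      t$=Chr(ReadAsciiCharacter(1))
      If t$=#CR$
        If Chr(ReadAsciiCharacter(1))=#LF$ : t$+#LF$ : EndIf
      EndIf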
      If t$=#CRLF$
        Debug "End of line: Windows (CR+LF)"
        file_ending$=#CRLF$
      ElseIf t$=#LF$
        Debug "End of line: Unix (LF)"
        file_ending$=#LF$
      ElseIf t$=#CR$
        Debug "End of line: Mac (CR)"
        file_ending$=#CR$
      EndIf
    Until Eof(1) Or file_ending$<>""
      
    FileSeek(1,location_of_file.q)
    
    t$=ReadString(1,#PB_UTF8|#PB_File_IgnoreEOL) 
  CloseFile(1)
  
  ; Convert to Unix
;   ConvertStringToUnix(t$)
  
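  ; Count number of words (one word per line)
  If file_ending$=""
    ; No line ending found: the file holds at most one word
    counter=Bool(t$<>"")
  Else
    counter=CountString(t$,file_ending$)
    If Right(t$,Len(file_ending$))<>file_ending$ : counter+1 : EndIf
  EndIf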
  
  Debug "counter:"+Str(counter)
  
  ; Return the number of words
  ProcedureReturn counter
  
EndProcedure

Re: How to check if a .txt file uses line endings for Windows, Unix, or Mac?

Posted: Fri Feb 23, 2024 7:39 am
by infratec
Simply test it. :wink:

Btw. this code is ... not good.

Re: How to check if a .txt file uses line endings for Windows, Unix, or Mac?

Posted: Fri Feb 23, 2024 8:03 am
by marcoagpinto
infratec wrote: Fri Feb 23, 2024 7:39 am
Simply test it. :wink:

Btw. this code is ... not good.

Could you suggest a better one, please?

Thanks!

Re: How to check if a .txt file uses line endings for Windows, Unix, or Mac?

Posted: Fri Feb 23, 2024 9:00 am
by jacdelad
Read the whole file into one string, check for #CRLF$, if not found check for #CR$, if not found check for #LF$, done.
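
The check order described above (CRLF first, then CR, then LF) could be sketched roughly as follows. This is only a minimal sketch: the procedure name DetectLineEnding and the file name "words.txt" are placeholders that do not come from the thread, and the file is assumed to be UTF-8. Note that a file with mixed endings is reported by whichever test matches first.

```purebasic
EnableExplicit

; Hypothetical helper (the name is an assumption): returns the line ending
; found in text$. CRLF is tested first so that a Windows ending is not
; mistaken for a lone CR or a lone LF.
Procedure.s DetectLineEnding(text$)
  If FindString(text$, #CRLF$)
    ProcedureReturn #CRLF$
  ElseIf FindString(text$, #CR$)
    ProcedureReturn #CR$
  ElseIf FindString(text$, #LF$)
    ProcedureReturn #LF$
  EndIf
  ProcedureReturn ""                 ; no line ending at all
EndProcedure

Define text$, eol$

If ReadFile(0, "words.txt")          ; "words.txt" is a placeholder name
  ReadStringFormat(0)                ; skip a BOM if one is present
  text$ = ReadString(0, #PB_UTF8 | #PB_File_IgnoreEOL)  ; whole file in one string
  CloseFile(0)

  eol$ = DetectLineEnding(text$)
  If eol$ = #CRLF$
    Debug "Windows (CR+LF)"
  ElseIf eol$ = #CR$
    Debug "Mac (CR)"
  ElseIf eol$ = #LF$
    Debug "Unix (LF)"
  Else
    Debug "no line ending found"
  EndIf
Else
  Debug "could not open the file"
EndIf
```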

Can you please stop using ChatGPT for code generation? It simply doesn't work...

Re: How to check if a .txt file uses line endings for Windows, Unix, or Mac?

Posted: Fri Feb 23, 2024 10:45 am
by AZJIO
This code checks the entire file, but you can check just the first line ending instead, as long as you can be sure that the whole document uses the same line endings throughout.

Code: Select all

Procedure IsCRLF(text$)
	Protected CR, LF
	LF = CountString(text$, #LF$)
	CR = CountString(text$, #CR$)
	If LF = 0 And CR = 0
		Debug "undefined"
	Else
		If LF = CR
			If LF = CountString(text$, #CRLF$)
				Debug "CRLF (Win)"
			Else
				Debug "mixed"
			EndIf
		ElseIf LF And Not CR
			Debug "LF (Linux)"
		ElseIf CR And Not LF
			Debug "CR (Mac)"
		Else
			Debug "mixed"
		EndIf
	EndIf
EndProcedure

IsCRLF("qwerty")
IsCRLF("qwerty" + #CRLF$ + "qwerty" + #CRLF$ + "qwerty" + #CRLF$)
IsCRLF("qwerty" + #CRLF$ + "qwerty" + #CR$ + "qwerty" + #LF$)
IsCRLF("qwerty" + #CR$ + "qwerty" + #LF$ + "qwerty" + #CR$ + "qwerty" + #LF$)
IsCRLF("qwerty" + #LF$ + "qwerty" + #LF$ + "qwerty" + #LF$)
IsCRLF("qwerty" + #CR$ + "qwerty" + #CR$ + "qwerty" + #CR$)

Determination by first occurrence:

Code: Select all

EnableExplicit

Procedure z(x)
	Select x
		Case 1
			Debug "CRLF (Win)"
		Case 2
			Debug "LF (Linux)"
		Case 3
			Debug "CR (Mac)"
		Default 
			Debug "undefined"
	EndSelect
EndProcedure

Procedure IsCRLF(*c.Character)
	Protected *c2.Character
	Protected res
	
	If *c = 0 Or *c\c = 0
		ProcedureReturn 0
	EndIf
	
	While *c\c
		If *c\c = #CR Or *c\c = #LF
			If *c\c = #LF
				res = 2
				Break
			ElseIf *c\c = #CR
				*c + SizeOf(Character)
				If *c\c = #LF
					res = 1
					Break
				Else	
					res = 3
					Break
				EndIf
			EndIf
		EndIf
		*c + SizeOf(Character)
	Wend
	
	ProcedureReturn res
EndProcedure

Define MyStr$

MyStr$ = "qwerty"
z(IsCRLF(@MyStr$))
MyStr$ = "qwerty" + #CRLF$ + "qwerty" + #CRLF$ + "qwerty" + #CRLF$
z(IsCRLF(@MyStr$))
MyStr$ = "qwerty" + #CRLF$ + "qwerty" + #CR$ + "qwerty" + #LF$
z(IsCRLF(@MyStr$))
MyStr$ = "qwerty" + #CR$ + "qwerty" + #LF$ + "qwerty" + #CR$ + "qwerty" + #LF$
z(IsCRLF(@MyStr$))
MyStr$ = "qwerty" + #LF$ + "qwerty" + #LF$ + "qwerty" + #LF$
z(IsCRLF(@MyStr$))
MyStr$ = "qwerty" + #CR$ + "qwerty" + #CR$ + "qwerty" + #CR$
z(IsCRLF(@MyStr$))

Buy a free program from me; it generates code no worse than ChatGPT.

This is how artificial intelligence works in practice: it produces a result, and if you do not understand it, you ask the professionals to figure it out. A professional rearranges the keywords and, lo and behold, the code works. By the same logic, you could dump a pile of crushed stone next to your house and call it a garage; all it needs is a professional's hand to slightly correct the design.

Re: How to check if a .txt file uses line endings for Windows, Unix, or Mac?

Posted: Fri Feb 23, 2024 2:00 pm
by jacdelad
Depending on what you want to do with the file, you can separate the lines using a simple regex that automatically recognizes CR/LF/CRLF.
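
One possible way to do this with PureBasic's RegularExpression library is sketched below. The pattern is an assumption, not necessarily the one meant above: instead of matching the line endings themselves, it extracts every run of characters that contains neither CR nor LF, so empty lines are skipped. The sample string is made up for illustration.

```purebasic
EnableExplicit

; Sample text mixing all three line-ending styles (illustrative only)
Define text$ = "one" + #CRLF$ + "two" + #LF$ + "three" + #CR$ + "four"
Dim lines$(0)
Define i, count

; [^\r\n]+ matches each run of characters that contains no CR or LF,
; so the kind of line ending in the file does not matter.
If CreateRegularExpression(0, "[^\r\n]+")
  count = ExtractRegularExpression(0, text$, lines$())
  For i = 0 To count - 1
    Debug lines$(i)
  Next
  FreeRegularExpression(0)
Else
  Debug RegularExpressionError()
EndIf
```

With the sample string, each of the four words should end up as a separate array element, and the value in count then doubles as the word count for a one-word-per-line list.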