By ananvodo


2018-07-11 23:03:23 8 Comments

I have a document that I opened using Python:

with open('my_file.txt', 'r') as fin:
    myfile = fin.readlines()

Inside myfile I have lines like this:

     1HEE     JJ    1   3.904   5.512   1.259\n
     2HEE    CJJ    2   4.199   5.292   1.353\n
     2LLO    SJJ    3   4.367   5.234   1.445\n
     3LLO     JJ    4   4.041   4.969   1.220\n
   6.50000   6.50000  6.50000\n
 This is some other title.\n
 3\n
     1GOO    HSC    1   4.088   4.816   1.041\n
     1DDD      H    2   9.018   0.828   7.094\n
     2DDD      H    3  19.018   0.828   7.094\n

The ONLY lines that I need to keep are these ones:

     1HEE     JJ    1   3.904   5.512   1.259\n
     2HEE    CJJ    2   4.199   5.292   1.353\n
     2LLO    SJJ    3   4.367   5.234   1.445\n
     3LLO     JJ    4   4.041   4.969   1.220\n
     1GOO    HSC    1   4.088   4.816   1.041\n
     1DDD      H    2   9.018   0.828   7.094\n
     2DDD      H    3  19.018   0.828   7.094\n

In other words, I MUST keep the lines that contain information throughout the slice:

myfile[line][:44]

The other lines (the shorter lines) I must DELETE.

Any ideas on how to do this?

2 comments

@Joooeey 2018-07-11 23:51:52

If the lines that have to be dropped always have fewer than 44 characters, and the valid ones always have at least 44 (as you say), you can just do:

with open('input.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        for line in infile:
            if len(line) >= 44:
                outfile.write(line)
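
Since the question already holds the lines in a list (from fin.readlines()), the same length test can be sketched as an in-memory filter. This is only a sketch under assumptions: keep_long_lines and the sample list are hypothetical names made up for illustration, and the trailing newline is stripped before measuring, whereas len(line) in the loop above counts it.

```python
def keep_long_lines(lines, min_length=44):
    """Return the lines with at least `min_length` characters, newline excluded."""
    return [line for line in lines if len(line.rstrip("\n")) >= min_length]


# Sample data modelled on the question; real input would come from fin.readlines().
sample = [
    "     1HEE     JJ    1   3.904   5.512   1.259\n",
    "   6.50000   6.50000  6.50000\n",
    " This is some other title.\n",
    " 3\n",
    "     1GOO    HSC    1   4.088   4.816   1.041\n",
]
kept = keep_long_lines(sample)  # only the two long data rows survive
```

Stripping the newline first means a line is judged by its visible content only, so the threshold does not shift with the line ending.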

@Scott Anderson 2018-07-11 23:25:57

Perhaps I use regex as a go-to too much, but the re module seems perfect here, given that you want to identify lines by the pattern of the data within them. If you don't know it already, the re module uses Perl-style pattern syntax; see the Python documentation for the re module.

You can test a regex you build online using a tool such as regex101.

If you are trying to identify a line such as 1HEE JJ 1 3.904 5.512 1.259 (with runs of spaces before and between the fields), I would write a regex something like: ^ +\w{4} +\w+ +\d+ +\d+\.\d{3} +\d+\.\d{3} +\d+\.\d{3} *$ (try it in regex101). Please note that this pattern makes some assumptions about the actual string pattern based upon what is given in the example: a four-character first field, and coordinates with exactly three decimal places.

Using a function from the re module such as re.findall with this pattern (and the re.MULTILINE flag, so that anchors such as ^ match at every line rather than only at the start of the text), you should be able to gather all lines that follow your desired format. To clarify: re.findall returns every match as a list of strings, which you can then manipulate as you please, including writing a new text file that contains only the harvested data.
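
The regex approach can also be sketched line by line instead of with re.findall. This is a rough sketch, not a definitive pattern: ROW_PATTERN and filter_rows are hypothetical names, and the assumed layout (a number fused to a letter code, a name, an integer, then three numbers with exactly three decimals) is inferred only from the sample rows in the question.

```python
import re

# Assumed row layout, inferred from the question's sample data:
# <digits><letters>  <name>  <integer>  <x.xxx>  <y.yyy>  <z.zzz>
ROW_PATTERN = re.compile(
    r"^\s*\d+[A-Za-z]+\s+\w+\s+\d+(?:\s+-?\d+\.\d{3}){3}\s*$"
)


def filter_rows(lines):
    """Keep only the lines that match the assumed data-row layout."""
    return [line for line in lines if ROW_PATTERN.match(line)]


# Sample data modelled on the question; real input would come from fin.readlines().
sample = [
    "     1HEE     JJ    1   3.904   5.512   1.259\n",
    "   6.50000   6.50000  6.50000\n",
    " This is some other title.\n",
    " 3\n",
    "     2DDD      H    3  19.018   0.828   7.094\n",
]
rows = filter_rows(sample)  # the box, title, and count lines are dropped
```

Matching one line at a time avoids the need for re.MULTILINE and keeps each original line, including its newline, intact for writing back out.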

@ananvodo 2018-07-11 23:28:59

is there a way to say: if line[:44] has info then keep else delete?

@Scott Anderson 2018-07-11 23:32:12

I am not sure what you mean. First of all, do you mean line number 44 (and not the 44th match of the regex pattern)?

@ananvodo 2018-07-11 23:43:02

When you read a file, you do with open("my.txt", "r") as fin: myfile = fin.readlines(). Then you reference myfile[line number][position in line]. The lines I need to keep are myfile[line][:44].

@Scott Anderson 2018-07-11 23:55:14

That is a slightly different topic, but you should be able to do it by looping over range(44), calling <file_object>.readline() once per iteration, and appending each result to a list that accumulates the lines, e.g.: txt_file = open("Joke1.txt", "r"); line_list = [txt_file.readline() for i in range(2)]. This works because the file object remembers its current position between reads, which is a longer discussion.
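
To illustrate the point about the file position, here is a small sketch that reads a fixed number of lines with readline(). The helper name read_first_lines is hypothetical, and io.StringIO stands in for a real file so the example is self-contained.

```python
import io


def read_first_lines(file_obj, count):
    """Read up to `count` lines with readline(), stopping early at end of file."""
    lines = []
    for _ in range(count):
        line = file_obj.readline()
        if not line:  # readline() returns an empty string at end of file
            break
        lines.append(line)
    return lines


# io.StringIO behaves like an open text file for this purpose.
fake_file = io.StringIO("first\nsecond\nthird\n")
head = read_first_lines(fake_file, 2)
rest = fake_file.readline()  # continues from where the previous reads stopped
```

Each readline() call advances the position, so the later read picks up at the third line rather than starting over.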
