python - Deleting specific text from a text file


I have a text file like this:

    >e8|e2|e9d
    football game health can play every day
    >e8|e2|e10d
    sequence unavailable
    >e8|e2|ekb
    cricket

I wrote the following code to detect "sequence unavailable" in the text file and write it to a new text file:

    lastline = None
    with open('output.txt', 'w') as w:
        with open('input.txt', 'r') as f:
            for line in f.readlines():
                if not lastline:
                    # remember the header line that precedes the body
                    lastline = line.rstrip('\n')
                    continue
                if line.rstrip('\n') == 'sequence unavailable':
                    _, _, id = lastline.split('|')
                    data = 'sequence unavailable|' + id
                    w.write(data)
                    w.write('\n')
                lastline = None

It works fine: it detects "sequence unavailable" in the text file and writes it to a new file. But I also want to delete that entry from the file being read, like this:

input.txt

    >e8|e2|e9d
    football game health can play every day
    >e8|e2|e10d
    sequence unavailable
    >e8|e2|ekb
    cricket

After running the code, input.txt should look like this:

    >e8|e2|e9d
    football game health can play every day
    >e8|e2|ekb
    cricket

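Here I am not using the file.readlines method, because it fetches all the lines of the file into a list, so it is not memory efficient. Iterating over the file object reads one line at a time instead.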

Method 1: Using a temporary file.

    import os

    with open('input.txt') as f1, open('output.txt', 'w') as f2, \
            open('temp_file', 'w') as f3:
        lines = []       # store lines between 2 `>` in list
        for line in f1:
            if line.startswith('>'):
                if lines:
                    # previous record was fine, keep it
                    f3.writelines(lines)
                    lines = [line]
                else:
                    lines.append(line)
            elif line.rstrip('\n') == 'sequence unavailable':
                # move the header and this line to output.txt
                f2.writelines(lines + [line])
                lines = []
            else:
                lines.append(line)

        f3.writelines(lines)   # flush the last record

    # replace the original file with the filtered copy
    os.remove('input.txt')
    os.rename('temp_file', 'input.txt')

Demo:

    $ cat input.txt
    >e8|e2|e9d
    football game health can play every day
    >e8|e2|e10d
    sequence unavailable
    >e8|e2|ekb
    cricket

    $ python so.py

    $ cat input.txt
    >e8|e2|e9d
    football game health can play every day
    >e8|e2|ekb
    cricket
    $ cat output.txt
    >e8|e2|e10d
    sequence unavailable

For generating the temp file you can also use the tempfile module, which picks a unique file name for you.
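As a rough sketch of that tempfile variant (not part of the code above): the function name split_unavailable and its parameters are made up for illustration, and it assumes Python 3.3+ for os.replace. The temp file is created in the same directory as the input so the final replace stays on one filesystem.

```python
import os
import tempfile


def split_unavailable(src_path, removed_path, marker='sequence unavailable'):
    """Move records whose body line equals `marker` from src_path to removed_path.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    src_dir = os.path.dirname(os.path.abspath(src_path))
    with open(src_path) as src, \
            open(removed_path, 'w') as removed, \
            tempfile.NamedTemporaryFile('w', dir=src_dir, delete=False) as tmp:
        record = []  # lines buffered since the last '>' header
        for line in src:
            if line.startswith('>') and record:
                tmp.writelines(record)   # previous record is kept
                record = [line]
            elif line.rstrip('\n') == marker:
                removed.writelines(record + [line])
                record = []
            else:
                record.append(line)
        tmp.writelines(record)           # flush the last record
    # All files are closed here; swap the filtered copy into place.
    os.replace(tmp.name, src_path)
```

Calling split_unavailable('input.txt', 'output.txt') should then behave like Method 1, without a hard-coded temp file name.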

Method 2: Using the fileinput module.

There is no need for a temp file with this method:

    import fileinput

    with open('output.txt', 'w') as f2:
        lines = []
        # with inplace=True, anything printed to stdout is written to input.txt
        for line in fileinput.input('input.txt', inplace=True):
            if line.startswith('>'):
                if lines:
                    print("".join(lines), end="")   # Python 3 print
                    lines = [line]
                else:
                    lines.append(line)
            elif line.rstrip('\n') == 'sequence unavailable':
                f2.writelines(lines + [line])
                lines = []
            else:
                lines.append(line)

        # the loop is done, so append the last buffered record by hand
        with open('input.txt', 'a') as f:
            f.writelines(lines)
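Both methods share the same buffering idea: collect the lines that follow a '>' header, then decide where the whole group goes. Here is a minimal sketch of that idea as a generator. The names iter_records and is_unavailable are hypothetical, and unlike the code above it treats everything up to the next '>' line as one record.

```python
def iter_records(lines):
    """Yield each record as a list of lines; a record starts at a '>' line."""
    record = []
    for line in lines:
        if line.startswith('>') and record:
            yield record
            record = []
        record.append(line)
    if record:
        yield record  # last record in the input


def is_unavailable(record, marker='sequence unavailable'):
    """True if any body line (everything after the header) equals the marker."""
    return any(line.rstrip('\n') == marker for line in record[1:])


sample = [
    '>e8|e2|e9d\n', 'football\n',
    '>e8|e2|e10d\n', 'sequence unavailable\n',
    '>e8|e2|ekb\n', 'cricket\n',
]
kept = [r for r in iter_records(sample) if not is_unavailable(r)]
removed = [r for r in iter_records(sample) if is_unavailable(r)]
```

Since iter_records accepts any iterable of lines, it can be fed an open file object directly and still reads one line at a time.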
