multithreading - Writing a file using multiple threads in Java
I am trying to write a single huge file in Java using multiple threads.
I have tried both the FileWriter and BufferedWriter classes in Java.
The content being written is an entire table (Postgres), read using CopyManager and then written out. Each line in the file is a single tuple from the table, and I am writing hundreds of lines at a time.
Approach to writing:
The single to-be-written file is opened by multiple threads in append mode. Each thread thereafter tries writing to that file.
These are the issues I face:
Once in a while, the contents of the file get overwritten, i.e. one line remains incomplete and the next line starts right there. My assumption is that the writer's buffer is getting full, which forces the writer to flush its data to the file. The data flushed may not be a complete line, and before the writer can write the remainder, the next thread writes its content to the file.
While using FileWriter, once in a while I see a single blank line in the file.
Any suggestions on how to avoid this data integrity issue?
Shared resource == contention
Writing to a normal file is by definition a serialized operation. You gain no performance by trying to write to it from multiple threads: I/O is a finite, bounded resource with orders of magnitude less bandwidth than even the slowest or most overloaded CPU.
Concurrent access to a shared resource can be complicated (and slow).
If you have multiple threads doing expensive calculations, then you have options. If you are using multiple threads only because you think it is going to speed something up, it is going to do the opposite. Contention for I/O slows down access to the resource; it never speeds it up, because of the lock waits and other overhead.
You have to have a critical section that is protected and allows only a single writer at a time. Look up the source code of any logging writer that supports concurrency and you will see that there is only a single thread that writes to the file.
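A minimal sketch of such a protected critical section, assuming each thread hands over whole batches of complete lines. The class name LockedLineWriter and the method writeLines are hypothetical, not taken from the question. All threads share one BufferedWriter instead of each opening the file in append mode, and the synchronized method means a batch can never be split by another thread's output.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper: one shared writer guarded by one lock. */
public class LockedLineWriter implements AutoCloseable {
    private final BufferedWriter out;

    public LockedLineWriter(Path file) throws IOException {
        this.out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
    }

    /** Writes a batch of complete lines; only one thread can be in here at a time. */
    public synchronized void writeLines(List<String> lines) throws IOException {
        for (String line : lines) {
            out.write(line);
            out.newLine();
        }
    }

    @Override
    public synchronized void close() throws IOException {
        out.close(); // flushes whatever is still buffered
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("table-dump", ".txt");
        try (LockedLineWriter writer = new LockedLineWriter(file)) {
            Thread[] workers = new Thread[4];
            for (int t = 0; t < workers.length; t++) {
                final int id = t;
                workers[t] = new Thread(() -> {
                    try {
                        for (int batch = 0; batch < 50; batch++) {
                            // Stand-in for rows read from the table.
                            writer.writeLines(List.of(
                                    "thread-" + id + " batch-" + batch + " row-1",
                                    "thread-" + id + " batch-" + batch + " row-2"));
                        }
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
                workers[t].start();
            }
            for (Thread w : workers) {
                w.join();
            }
        }
        System.out.println(Files.readAllLines(file).size() + " lines in " + file);
    }
}
```

The lock fixes the torn lines, but the threads now queue up behind each other for the write, so this is about correctness, not speed.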
If your application is primarily:
CPU bound: you can use some locking mechanism or data construct to let only one thread out of many write to the file at a time. As a naive solution this is useless from a concurrency standpoint, but if the threads are CPU bound with little I/O it might work.
I/O bound: this is the most common case. You must use a message-passing system with a queue of some sort: have all the threads post to the queue/buffer and have a single thread pull from it and write to the file. This is the most scalable solution and the easiest to implement.
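The queue approach can be sketched roughly as follows, assuming batches of lines are passed through a bounded java.util.concurrent.BlockingQueue. The class QueueFileWriter, the method dump, and the sentinel POISON are hypothetical names used for illustration, and error handling is kept to a minimum.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueFileWriter {
    // Unique sentinel batch (compared by identity) that tells the writer to stop.
    private static final List<String> POISON = new ArrayList<>();

    /** Producers post batches to a queue; a single thread drains it and writes the file. */
    public static void dump(Path file, int producers, int batchesPerProducer) throws Exception {
        // Bounded, so producers block instead of exhausting memory if the disk is slow.
        BlockingQueue<List<String>> queue = new LinkedBlockingQueue<>(1000);

        Thread writer = new Thread(() -> {
            try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                for (List<String> batch = queue.take(); batch != POISON; batch = queue.take()) {
                    for (String line : batch) {
                        out.write(line);
                        out.newLine();
                    }
                }
            } catch (IOException | InterruptedException e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        Thread[] workers = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            final int id = p;
            workers[p] = new Thread(() -> {
                try {
                    for (int b = 0; b < batchesPerProducer; b++) {
                        // Stand-in for rows read from the table.
                        queue.put(List.of(
                                "producer-" + id + " batch-" + b + " row-1",
                                "producer-" + id + " batch-" + b + " row-2"));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[p].start();
        }

        for (Thread w : workers) {
            w.join();
        }
        queue.put(POISON); // all producers are done, so let the writer finish
        writer.join();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("table-dump", ".txt");
        dump(file, 4, 25);
        System.out.println(Files.readAllLines(file).size() + " lines in " + file);
    }
}
```

Because only the writer thread ever touches the file, no lock around the writer is needed; the queue is the only shared structure.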
Journaling - async writes
If you need to create a single super large file where the order of writes is unimportant and the program is CPU bound, you can use a journaling technique.
Have each process write to a separate file, then concatenate the multiple files into a single large file at the end. This is a very old-school, low-tech solution that works well and has for decades.
Obviously, the more storage I/O bandwidth you have, the better the final concatenation will perform.
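As a rough illustration of the separate-files-then-concatenate idea, the sketch below uses threads in place of processes and java.nio.file.Files for the final concatenation. The class name PartFileMerger, the method merge, and the part-N.txt file names are assumptions made for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class PartFileMerger {
    /** Appends the bytes of each part file, in list order, to the target file. */
    public static void merge(List<Path> parts, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("parts");
        List<Path> parts = new ArrayList<>();
        Thread[] workers = new Thread[3];

        for (int i = 0; i < workers.length; i++) {
            Path part = dir.resolve("part-" + i + ".txt"); // one private file per worker
            parts.add(part);
            final int id = i;
            workers[i] = new Thread(() -> {
                try {
                    // Stand-in for rows read from the table; no sharing, so no locking.
                    Files.write(part, List.of("worker-" + id + " row-1", "worker-" + id + " row-2"));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }

        Path merged = dir.resolve("merged.txt");
        merge(parts, merged);
        System.out.println(Files.readAllLines(merged).size() + " lines in " + merged);
    }
}
```

Each part file must end with a line terminator, as Files.write produces here, or the last line of one part would run into the first line of the next.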