Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(77)

Issue 325430043: Yet another fix

Can't Edit
Can't Publish+Mail
Start Review
Created:
6 years, 6 months ago by w.fakher.reda
Modified:
6 years, 6 months ago
Visibility:
Public.

Description

Yet another fix update R update update Yet another fix Fix Adding ability to compute R timestamps from originals A few updates Trying out another merge function Adding some logs Fix Adding support for average latency computation Small fix Updated raw optimized to allocate 1 file per thread BUG=

Patch Set 1 #

Total comments: 17
Unified diffs Side-by-side diffs Delta from patch set Stats (+172 lines, -53 lines) Patch
M core/src/main/java/com/yahoo/ycsb/Client.java View 3 chunks +1 line, -13 lines 0 comments Download
M core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java View 3 chunks +171 lines, -40 lines 17 comments Download

Messages

Total messages: 2
w.fakher.reda
Rawoptimized changes
6 years, 6 months ago (2017-09-12 12:19:22 UTC) #1
kirill.sc
6 years, 6 months ago (2017-09-12 15:34:33 UTC) #2
See my comments. 
I have ignored the rest of the file as it is not performance critical.

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
File
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java
(left):

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:87:
private String printFormat = "text";
Add a comment stating what are other types available (if needed)

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:132:
RawDataPoint point = new RawDataPoint(latency);
This is completely useless operation. just save the timestamp in the local
variable. you dont need the object here. 
moreover, RawDatPoint seems to be a code duplication.

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
File
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java
(right):

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:27:
import java.util.concurrent.ConcurrentHashMap;
Aren't these redundant because of the import above?

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:76:
public static final String TEMP_FILE_PATH = "measurement.raw.temp_file";
I would change the names:
TEMP_FILE_PATH -> RAW_MEASUREMENTS_OUTPUT_DIR  (i.e., It is not really a temp
(could be the same folder as ycsb) nor it is a single file)

measurement.raw.temp_file -> raw_measurements_output_dir

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:77:
public static final String TEMP_FILE_PATH_DEFAULT = "/mnt/ramdisk/";
RAW_MEASUREMENTS_OUTPUT_DIR_DEFAULT

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:101:
public static final String PRINT_DISPATCHING_TIMESTAMPS_PROPERTY =
"print_dispatching_timestamps";
I would add a bigger comment above. In the presence of various actual, inteded,
and R timestamps dispatching timestamp is not very clear

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:124:
totalLatency = new AtomicDouble();
Why do these variables need to be atomic? Why do we need to track it at
run-time? Cant each thread count its own totals and then merge at the end? 

See these:
https://stackoverflow.com/questions/3556283/in-java-what-is-the-performance-o...
https://dzone.com/articles/wanna-get-faster-wait-bit

basically synchronisation never comes for free. We should avoid it if possible.

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:144:
public void measure(int latency) {
add a comment stating that this function is being called for every response.

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:152:
Long threadId = Thread.currentThread().getId();
Can this be stored in thread local variable or passed into this function as an
argument? Can we avoid doing Thread.currentThread().getId(); for each operation?

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:153:
if (!outputStreams.containsKey(threadId)) {
ContainsKey has max complexity of O(n) we should avoid using it, moreover, since
threads are not communicating we don't need to use concurrent hash map at all.

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:154:
if (!outputFilePath.isEmpty()) {
From the function flow this does not belong here. Function should store latency
samples not initialize output streams. 
Stream should be initialized somewhere in constructor, here we dont use any if
istatements.

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:154:
if (!outputFilePath.isEmpty()) {
I would make it a preallocated array OR thread local variable -> no need to use
any getters.

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:155:
outputStreams.put(threadId, new DataOutputStream(new FileOutputStream(
I haven't finished reading on it, but basically there might be incentive to use
buffered output. We might not be able to use RAM all the time. 

https://stackoverflow.com/questions/8109762/new-dataoutputstreamnew-fileoutpu...
https://stackoverflow.com/questions/9568072/performance-of-datainputstream-da...
https://bytes.com/topic/java/answers/684097-dataoutputstream-vs-bufferedoutpu...

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:162:
if (printFormat.equals("text")) {
No string comparison on this code path too expensive. 
We should create OneMeasurementOptimizedBinaryWriter that will use the correct
method.

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:164:
String.format("%s,%d,%d", getName(), point.timeStamp(),
As I talked today, I am not very happy about this formatting at run-time. 
Consider dumping data as raw binary and then re-parsing as part of the
mergeFiles. That will save space and will reduce overhead.

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:167:
outputStreams.get(threadId).writeLong(point.timeStamp());
Can we write in one command? why do we have to call get() twice?

https://codereview.appspot.com/325430043/diff/1/core/src/main/java/com/yahoo/...
core/src/main/java/com/yahoo/ycsb/measurements/OneMeasurementRawOptimized.java:171:
catch (IOException e) {
I am not sure about java, but are these try{} catch necessary if you don't
create outputStream buffer but only use it? 
There is some little overhead in try catch as well. 
https://stackoverflow.com/questions/16451777/is-it-expensive-to-use-try-catch...
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b