Saturday, July 28, 2018

DB insertion speed test

For work, I tested which database inserts records the fastest. It was nothing elaborate: I created a table with integer (PK)/timestamp/integer columns and inserted 1 million records, changing only the PK and keeping the other values fixed.

The benchmark environment was as follows:
  • Dell Inspiron 7373/i5 8th generation
  • Windows 10 Home
  • Target databases (all 32-bit builds)
    • PostgreSQL 10.6: used libpq
    • Firebird 3.0.3: used OO API (C++)
    • SQLite3: #include <sqlite3.h> // ...... :P
Here are the results and some remarks. I hope they help visitors.
(I'm not yet good enough to release something as open source, so this is the best I can offer to the community...... ;) )
  • Insertion speed
    • 1: SQLite, in-memory database: 2 seconds
    • 2: PostgreSQL, COPY: 5 seconds
    • 3: PostgreSQL, bulk insert: 7 seconds
    • 4: Firebird, bulk insert: 10 seconds
    • 5: SQLite, save to file: 56 seconds
  • Disk I/O
    • Because of how the Windows kernel behaves, too many small I/Os put a heavy load on the disk, which then becomes the bottleneck
    • When saving to a file with SQLite, where I could not use bulk insertion, disk I/O hit 100%
    • In Firebird, bulk insertion is implemented via PSQL: you send a number of records at once, and individual INSERT statements in the PSQL block write them sequentially. This still loads the disk heavily (around 50% disk I/O), though less than SQLite
    • In Firebird, set the page size to at least 8192 to reduce the disk I/O load
  • Bulk insertion
    • Bulk insertion speeds things up, but only up to a point: beyond a certain batch size the speed stops improving. This limit is unrelated to disk I/O. The best batch sizes I found:
    • Firebird: 10 records at once, regardless of page size
      • Stored procedure
    • PostgreSQL: 40 records at once
  • PostgreSQL
    • COPY (source: CSV) is remarkably fast. Even the official documentation recommends COPY for bulk insertion
    • In libpq, the speed is about the same whether you send the records as text or as binary
  • Miscellany
    • It's far faster to use the official interface directly than to go through a wrapper. When I first tested Firebird with SOCI it took 59 seconds, while the same logic written directly against the Firebird OO API finished in 20-something seconds.
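To make the batching idea concrete, here is a minimal sketch of grouping rows into multi-row INSERT statements. The post does not say how the batches were actually built, so the multi-row VALUES form, the table name bench, the column names, the fixed timestamp literal, and both function names are assumptions made up for illustration. The batch size of 40 is simply the PostgreSQL figure reported above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds one multi-row INSERT covering primary keys [first_pk, first_pk + count).
// The table "bench" and its columns are hypothetical. The timestamp and the
// second integer stay fixed, mirroring the "only the PK changes" test setup.
std::string build_batch_insert(long first_pk, std::size_t count) {
    std::string sql = "INSERT INTO bench (id, ts, val) VALUES ";
    for (std::size_t i = 0; i < count; ++i) {
        if (i != 0) sql += ", ";
        sql += "(" + std::to_string(first_pk + static_cast<long>(i)) +
               ", '2018-07-28 00:00:00', 0)";
    }
    sql += ";";
    return sql;
}

// Splits `total` rows into statements of at most `batch_size` rows each.
// Primary keys start at 1 and increase by one per row.
std::vector<std::string> build_all_batches(std::size_t total, std::size_t batch_size) {
    assert(batch_size > 0);  // a zero batch size would never make progress
    std::vector<std::string> statements;
    for (std::size_t done = 0; done < total; done += batch_size) {
        const std::size_t n = std::min(batch_size, total - done);
        statements.push_back(build_batch_insert(static_cast<long>(done) + 1, n));
    }
    return statements;
}
```

With these assumptions, build_all_batches(1000000, 40) produces 25,000 statements, each of which could be handed to something like libpq's PQexec. The database call itself is left out so that the sketch stays self-contained.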
That's all. Please let me know if you have any questions or comments. =_=/

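DB insertion speed test

For work reasons, I ended up testing which DB does record insertion the fastest. It was nothing grand: I wrote a program that registers 1 million records in one go into a table made of integer (PK)/timestamp/integer, changing only the PK value and keeping the rest fixed, and ran it.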

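The target environment is as follows:
  • Dell Inspiron 7373/i5 8th generation model
  • Windows 10 Home
  • Target DBs (all 32-bit builds)
    • PostgreSQL 10.6: using libpq
    • Firebird 3.0.3: using OO API (C++)
    • SQLite3: #include <sqlite3.h> // ...... :P
Here is a summary of the results and takeaways. I hope it helps.
(I'm not yet skilled enough to build something and release it as open source, so at least I can offer this......)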
  • Insertion speed
    • 1st: SQLite, in-memory database: 2 seconds
    • 2nd: PostgreSQL, COPY: 5 seconds
    • 3rd: PostgreSQL, bulk insert: 7 seconds
    • 4th: Firebird, bulk insert: 10 seconds
    • 5th: SQLite, saving to a file: 56 seconds
  • Disk I/O
    • Due to the nature of the Windows kernel, when small I/Os pile up the disk is overloaded and becomes the bottleneck
    • With SQLite saving to a file, where bulk insert is not possible, disk I/O reached 100%
    • In Firebird, bulk insert is implemented with PSQL: it receives multiple records at once and writes them to disk sequentially using individual INSERT statements, so disk I/O is still high (50% disk I/O), though not at the SQLite level
    • In Firebird, setting the page size to at least 8192 helps reduce the disk I/O load
  • Bulk insertion
    • Bulk insertion can raise insertion speed, but past a certain level the speed no longer increases. This is separate from disk I/O
    • Firebird: 10 records (regardless of page size)
      • Stored procedure
    • PostgreSQL: 40 records
  • PostgreSQL
    • COPY (source: CSV-based) is exceptionally fast. In fact, the official manual also recommends COPY for bulk insertion
    • In libpq, the time taken is the same whether the insertion data is sent as text or as binary
  • Miscellany
    • Using the direct interface each DB provides is far faster than using a wrapper. In my first Firebird test it took 59 seconds with SOCI, but implementing the same logic with the Firebird OO API finished in the 20-second range
That's about it. If you have any questions, please ask. =_=/

Tuesday, July 17, 2018

Compile qdoc in Qt 5.11 with MinGW

As of Qt 5.11, qdoc depends on libclang.
Qt Creator is also said to be switching, from 4.7, to the clang codemodel as its main C++ parser in place of its homemade one, and now qdoc follows.

OK. Good. Understandable..... But we have a problem: libclang does not build with MinGW 5.3. The LLVM buildbot claims to use GCC 5.3, but internally it seems to use MinGW 7.1. It looks like Qt will upgrade its MinGW to 7.3 in Qt 5.12 (7.1 looks buggy, so they have to use 7.3 instead).
So now we know the root cause and the solution. But because the target release date left too little time, Qt 5.11 still uses MinGW 5.3 and simply blocks building qdoc. And Qt 5.12, which is expected to solve the issue, will be released some time around Nov. 29th. Yep. Until then, MinGW users have to endure having no access to the documentation, which is one of Qt's selling points (you can use the online docs, but that is still inconvenient).

But who are we? We're developers. And Qt is an open source project. We can scratch wherever it itches. So I built qdoc with MinGW.

Prerequisites are as follows:
  1. Source code of Qt
  2. MinGW-w64: 7.3 or higher
  3. LLVM library prebuilt by Qt: it's a bit hard to find....... But here you go:
If you're going to build Qt Creator too, choose the libclang-release_50-windows-mingw series (as of now, the clang codemodel supports only libclang 5.0).

To build qdoc yourself, unzip the Qt source code and libclang, then edit qdoc's project file. The location is as follows:

Put the following two lines at the top of the project file:
  • INCLUDEPATH+=/Qt_prebuilt_clang/unzipped/include
  • LIBS+=-L/Qt_prebuilt_clang/unzipped/lib -llibclang.dll
After that, build Qt as usual using configure and mingw32-make, and then manually build qdoc.pro.

If you have been inconvenienced by the lack of documentation with MinGW, I hope this brings some relief.

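Compiling qdoc with MinGW in Qt 5.11

Starting with Qt 5.11, qdoc has a dependency on libclang.

Qt Creator also said that from 4.7 it would make the clang codemodel its main C++ parser instead of its own, and now qdoc depends on clang as well.

Well, that's all fine...... The problem is that this libclang does not build with MinGW 5.3. The LLVM buildbot seems to claim that it builds with GCC 5.3, but the compiler actually running internally is said to be MinGW 7.1. So, for various reasons, Qt seems to have settled internally on raising the MinGW version to 7.3 in Qt 5.12 (7.1 reportedly has noticeable bugs, so 7.3 has to be used).

The cause and the solution seem to be known already, but since the release date for Qt 5.11 was so tight, Qt kept MinGW at 5.3 and blocked the qdoc build instead. And the target release date of Qt 5.12, which is expected to solve this problem, is all the way out on November 29th. Yep. Just for using MinGW, MinGW users are stuck until the end of November this year with Qt's pride, its documentation, being hard to use (online docs are an option, but searching through a web browser is a hassle after all).

But who are we? We are developers. And Qt is provided as open source. Yes. If needed, we can tear into the source code (-_-). So I tried building qdoc with MinGW.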

Here is what you need:
  1. Qt source code
  2. MinGW-w64 7.3 or higher (-_-)
  3. The unofficial LLVM prebuilt library provided by Qt: this one is a bit tricky to find...... You can get it here:
If you plan to build Qt Creator as well, choose the libclang-release_50-windows-mingw series (the clang codemodel currently supports only libclang 5.0).

To build qdoc yourself, unzip the Qt source code and libclang somewhere suitable, and modify qdoc's project file. The location is as follows:

Then add the two lines below at the very top:
  • INCLUDEPATH+=/Qt_prebuilt_clang/unzipped/include
  • LIBS+=-L/Qt_prebuilt_clang/unzipped/lib -llibclang.dll
After that, run configure -> mingw32-make to build Qt, and then build qdoc.pro separately.

If you have been building Qt yourself with MinGW and suffered from the missing Qt documentation, I hope this eases the pain.

Tuesday, June 12, 2018

MinGW 8.1.0 vs. VC++2017: performance benchmark

I don't know why, but after writing yesterday that I was moving to VC++2017, I ran a simple benchmark. It was nothing sophisticated: it reads a file via ifstream, parses the data in memory, and measures the elapsed time with GetTickCount(). For my use case, MinGW was always 10~30% faster than VC++2017.

Hmmm...... Do I have to get back to MinGW?
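For anyone who wants to repeat this kind of measurement, here is a rough sketch of its shape, not my actual test program. The parsing step (summing whitespace-separated integers) is a hypothetical stand-in, the function names are made up, and std::chrono::steady_clock is swapped in for GetTickCount() so that the same source compiles with both MinGW and VC++.

```cpp
#include <chrono>
#include <fstream>
#include <iterator>
#include <sstream>
#include <string>

// Reads the whole file into memory through ifstream.
std::string read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical parsing step: sums whitespace-separated integers.
long long parse_and_sum(const std::string& data) {
    std::istringstream tokens(data);
    long long sum = 0;
    long long value = 0;
    while (tokens >> value) sum += value;
    return sum;
}

// Runs any callable once and returns the elapsed wall time in milliseconds.
template <typename F>
long long time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}
```

A run would look like time_ms([&] { total = parse_and_sum(read_file(path)); }), built once with each compiler using comparable optimization flags and repeated several times, since a single run on a small file is mostly noise.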

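MinGW 8.1.0 vs. VC++2017: performance comparison

Yesterday I wrote that I was moving to VC++2017, and then for whatever reason I ran a simple performance benchmark. I'm not skilled enough to have run a rigorous experiment: I just read a file with ifstream, loaded it into memory, did a bit of parsing, and timed the work with GetTickCount(). For my use case, MinGW was always about 10~30% faster than VC++2017.

I wonder if I should go back to MinGW...... -_-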

Monday, June 11, 2018

Going Native

A lot has happened behind the scenes. Due to some inflexible cash flow, my company closed its Korea branch, and I moved to a small company and took on a dual role as system engineer and developer. The company first expected engineering and development work to split 50:50, but in practice I concentrate on development. I think it's a good start for an ex-hobbyist programmer that the customer's feedback changed from negative to positive after my application was delivered to them.

Meanwhile, I thought about the technology stack I should carry with me. Qt can cover everything from desktop to mobile and is equipped with a lot of powerful features, but once I applied it to real work, several things held me back. For example:

  1. It's BIG. HUGE. OK, it's good to have a lot of things, but the library grows with every new feature. For simple tasks, I can't shake the feeling of cutting grass with a zweihander.
  2. SLOW. From the Qt Model-View framework, which is a failure and needs restructuring, to container classes like QList or QLinkedList that say "we're concentrating on convenience while STL stresses raw speed," and database manipulation is not that fast either. For databases, I found it far faster to use SOCI after converting from QString to std::string. In my experience, Qt took about twice as long or more for the same task compared with others (e.g. STL or wxWidgets).
  3. The developer is "locked" into the Qt ecosystem. I admit its rich features make development more convenient, but after adopting Qt I don't feel comfortable using it with other libraries or environments. Sometimes I feel I'm working in Java instead of C++, in terms of not only structure but also runtime performance (meaning that it's slow, unlike C++). While coding, I'm not sure whether I'm using Java or C++. Ah, of course, the signal-slot mechanism and QString are the kind of technological advance you see once in a decade.
  4. It has holes here and there, for example in Qt Quick. I love the approach and structure of Qt Quick and I admit it's quite an accomplishment, but I see occasional "broken" frames on Windows/Intel, and the first run takes too long on Android (these existed until Qt 5.9. I heard that the bytecode compilation in Qt 5.11 improved the slow first start a lot, but I had no chance to test it). Or, for me as a MinGW user, in Qt 5.11 qdoc (and with it the Qt documentation) suddenly became unavailable because MinGW 5.3 couldn't compile libclang(......). And the new autocompletion with libclang in Qt Creator is said to be toooooo slow....... And so on and on and on.
  5. Sometimes things are out of the programmer's reach. Qt hides its implementation details in a private layer, which makes debugging too difficult when something bad happens, especially when it's related to a specific feature of Qt itself. What's more, because QUrl forces percent-encoding (which malfunctions in specific contexts), I had to give up QNetworkAccessManager and adopt libcurl (with additional coding, where I had to implement about 3/4 of my network code again......).

Well, anyway, Qt is BIG enough to be criticized from any angle, but you know, now I can say there's no single library/framework that fits all. So I looked for alternatives and asked my close friends for help. And guess what? Now I'm dropping the Qt dependency and moving my development stack closer to each platform's native "recommended" environment.

  1. C++ / wxWidgets & more: whatever anyone says, C++ is my main development platform. And I concluded that wxWidgets, a thin wrapper over native widgets, can replace Qt Widgets. In my past experience I ended up using Qt Widgets anyway to save time (Qt Quick took relatively too long to develop with). wxWidgets consumes less memory and starts up faster, which fits my philosophy of "relatively fast running."
  2. Ah, I changed my compiler from MinGW to VC++. I had been using MinGW for consistency across platforms, namely Windows and Linux, but nowadays I need some Windows APIs that don't exist in MinGW (:P). And VC++ looks good as of VC++ 2017. I'll just have to work through the differences between the compilers.
  3. JavaScript: JavaScript is the language of the web! As soon as I'm done with my current project, I'd like to start learning Vue.js and Node.js (especially Express.js) and apply them at work.
  4. Kotlin: OK, the desktop is covered, but "mobile" still remains. Kotlin is native on Android (and I strongly feel that Google pushes it harder than ever after it lost the case against Oracle), and once Kotlin/Native is complete, iOS can be "natively" supported too. It's a good choice for me, since I won't touch the Apple ecosystem for a few years but have no idea what comes after that.

Originally I had Python and Rust on the list, but I found that I don't have to dig deep into Python even though it's used in the product I currently engineer, and I concluded that I can bring Rust's structure over to C++, so I don't have to worry much. I also considered Go, but its structure was not to my taste......

So that's all. For now, I just want to close out the current project and have some time to learn JavaScript.