Code{nested}

I'm moving to another blog platform

2024-10-11T22:50:00.002+09:00

Yeah. I'm moving to another blog platform.

Blogger is quite a legacy platform, which makes posting awkward (e.g. formatting code nicely), and readers seldom come by, so I'm moving my blog to dev.to, where my posts can reach more "related" people. I'd like to archive this blog after moving some articles over.

Now, see you on https://dev.to/teminian

Moving the blog

2024-10-11T22:48:00.001+09:00

Yep. I'm moving the blog.

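Blogger is a fairly old platform, so posting is a bit inconvenient (for example, getting code to look nice......), and there is next to no incoming traffic (......), so I'm moving to dev.to, a platform where posts are exposed to more of the relevant people. I plan to archive this blog, starting by slowly moving some of its posts over.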

See you at https://dev.to/teminian

AI can't read between the lines

2024-08-23T23:01:00.001+09:00

One of my acquaintances draws up the days-off schedule for his colleagues. Well, isn't that something each person decides for themselves? Yet there was a reason. The workplace has to stay open 7 days a week, and employees are expected to take two days off per week. If the choice were left to individuals, they would all pick the same days (e.g. Saturday or Sunday), so he had to distribute the days off himself.

But as you can imagine, a schedule made by hand can't be well randomized. Even a simple pseudorandom generator on your computer will produce a far better result. So he decided to automate the scheduling in Microsoft Excel with help from AI, and gave it these conditions:

Each employee takes 8 days of leave per month
The day of the week doesn't matter, but the distribution needs some randomization
Leave days should overlap as little as possible

So far, so good, right? AI can generate an Excel spreadsheet for that. But when he added the following condition, the AI went off the rails: its results started producing errors.

Each employee should be able to take leave at least once per week

If you're good with Excel, you'll quickly see that this can't be done in "pure" Excel without VBA macros. To satisfy the conditions above, I need to know when the others are taking leave, but the same applies to each of them. To know me I've got to know you, but to know you I've got to know myself, ....... Yep. A typical "circular reference" situation. Once you add such a condition, AI just gives up and says whatever it wants to say. Forget prompt engineering: it can't make the structurally impossible happen by magic. The best answer it can give is "this has to be done with VBA macros".

So you decide to develop a dedicated application instead? You still have a lot to consider. To schedule a month of leave, you need to avoid overlaps as much as possible, keep the weekly constraint in mind while building the monthly schedule, and decide how to handle the partial weeks at the start and end of the month. Many points to consider, and many exceptions to handle.

But this holds only when you take the problem as it is. Yes. AS IT IS.

Think a bit further and you can simplify and reassemble the problem so that it is solved quite easily. With 8 days of leave a month, everyone already has enough leave to take a day off at least once a week. If you divide the month by the number of leave days (here, 30/8 = 3.75 days) and spread the leave evenly at that interval, every employee gets a day off at least once a week. Of course the randomization becomes more limited than the original "at least once a week" condition would allow, but from the employer's perspective, a schedule with one day off a week for three weeks and the remaining five days packed into week 4 would be quite a headache anyway. :P
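The even-spread idea can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the post: a 30-day month, 0-based day numbers, one random starting offset per employee, and the hypothetical names `spread_leave` and `rests_every_week`. It does not try to minimize overlaps between employees; the random offset merely keeps them from all landing on the same days.

```python
import random


def spread_leave(days_in_month=30, leaves=8, rng=None):
    """Return 0-based day indices, one leave every days_in_month / leaves days."""
    rng = rng or random.Random()
    step = days_in_month / leaves        # 30 / 8 = 3.75 days between leaves
    offset = rng.random() * step         # random phase in [0, step) per employee
    # min() guards against a floating-point sum rounding up to days_in_month.
    return [min(int(offset + i * step), days_in_month - 1) for i in range(leaves)]


def rests_every_week(days, days_in_month=30, window=7):
    """True if every run of `window` consecutive days contains a leave day."""
    leave = set(days)
    return all(
        any(d in leave for d in range(start, start + window))
        for start in range(days_in_month - window + 1)
    )


rng = random.Random(20240823)            # fixed seed only to make the demo repeatable
schedule = {name: spread_leave(rng=rng) for name in ("A", "B", "C")}
for name, days in schedule.items():
    # Print 1-based calendar days and whether the weekly condition holds.
    print(name, [d + 1 for d in days], rests_every_week(days))
```

Because consecutive leave days end up 3 or 4 days apart, any 7 consecutive days contain at least one of them, which is a slightly stronger guarantee than one day off per calendar week.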

A Large Language Model (LLM) can parse natural language and understand what a person says. But the core of an LLM is statistics and probability: it is built on which patterns appear most often when language is broken down to the word level. In other words, given a sentence, it has no problem understanding the sentence as it is, but reading "between the lines" or spotting a "hidden premise" and restructuring the problem itself, which is what we call inference, is beyond what current technology can achieve.
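As a deliberately tiny illustration of "which patterns appear most often at the word level", here is a toy bigram counter in Python. It is an assumption-laden sketch, not how a real LLM is built: real models are vastly larger and work on learned representations, and the corpus and function names below are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    return counts.most_common(1)[0][0] if counts else None


corpus = "take a leave once a week and take a leave twice a month"
model = train_bigrams(corpus)
print(most_likely_next(model, "a"))
```

The counter can only echo what it has seen; nothing in it can step back and restate the problem, which is the gap the paragraph above is pointing at.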

And yes, this is when human intervention is needed.

AI can generate source code, and some people say developers won't be needed anymore. Still, only a human can "truly understand" the depth of a problem and reassemble it so that it can be solved easily. Humans also know many conditions that are usually never given to the AI. I don't know the field well, but I suspect this is related to prompt engineering.

So, my dear fellow software developers in the world! Now is not the time.
Until AI can "infer properly", your desk is safe!

AI can't read between the lines

2024-08-23T22:58:00.009+09:00

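An acquaintance of mine was drawing up the days-off schedule for his staff. Shouldn't the person taking the day off decide that...... I thought, but it turned out the workplace runs a 7-day rotation in which employees take turns resting two days a week, and if this were left purely to individual preference, people would pile up on particular dates, so the senior staff member ended up assigning everyone's days off.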

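But even if you try to do this randomly, it can hardly be random when a person hands the days out by hand. It goes without saying that even a pseudorandom spread from a computer is far better than a human distributing them directly. So this person, a field manager with no ties to software development whatsoever, decided to automate it in Microsoft Excel with the help of an AI system. And he set the conditions......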

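8 days off per person per month
The day of the week doesn't matter, but randomness is required
Days off should be distributed so that they overlap as little as possible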

Well, it seems that much worked out somehow. Just put what the AI tells you into Excel and out it comes. But once one more condition was added, the AI gradually slipped into "so sue me" mode.

Everyone must be able to rest at least once a week

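If you've played with Excel a bit, you'll sense right away that the moment this condition is added, the problem can never be solved in pure(?) Excel without VBA. To satisfy all the conditions above I need to know when I have rested, but to know that I need to know when everyone else has rested, and this applies to every person alike. In other words, to know about me I have to know about others, but to know about others I have to know about me...... Yes. This is a textbook circular reference. Needless to say, once you start adding conditions like this, the AI digs in its heels. No amount of prompt engineering will conjure up some magic that makes the structurally impossible possible. However well you phrase it, the best answer you can get will be "Excel alone can't do this; you need to use VBA or the like".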

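Even if you skip Excel and develop a dedicated program, these conditions are harder to implement than you might think. You are building a schedule for a month, overlaps must be avoided as far as possible, you have to mind the weekly layout while laying out the monthly one, and you also have to work out what to do with the few days at the start or end of the month that don't make a full week. There are plenty of conditions to mind and more than a few exceptions to handle.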

But all of that applies only when you take the problem at face value.

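With a little thought, this problem can be solved quite simply. There are 8 days off a month and everyone must rest at least once a week; turned around, that means enough days off are given to rest at least once a week no matter what. In short, if you spread the days off evenly at intervals of the month divided by the number of days off (here, 30/8 = 3.75 days), everyone is guaranteed to rest at least once a week. This does limit the randomness compared with the original condition of "rest at least once a week", but even from the business owner's side, distributing the days as evenly as possible like this is better for staff morale and management than an awkward result where someone rests once a week for three weeks and then takes five days off in the fourth.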

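Large Language Models (LLMs) have advanced to the point where they can analyze and understand human natural language. However, the core of an LLM is statistics: it stems from learning how often which patterns appear when sentences are analyzed word by word. In short, when a person gives it a sentence, it has no trouble understanding the sentence itself, but inferential thinking, such as reading the unseen part commonly called "between the lines" and reframing the problem from scratch, is difficult for it.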

And, needless to say, this is exactly where a human is needed.

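However much this may be an age in which AI writes even the source code for you, and however loudly some people cry that developers will no longer be needed, the ability to understand a problem properly and recast it in a solvable form still seems to be something only humans have. And humans know countless conditions that were never given to the AI (the temperament of the client's contact person, or the things nobody mentioned but everyone assumes must work......). It's not a field I know well, but I suspect this also has something in common with prompt engineering.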

Developers of the world! The time has not come yet.
As long as AI cannot infer properly, your desks are safe.

Five days with Lite-XL

2024-07-25T10:20:00.003+09:00

There is an old saying from ancient China, 日新又日新, meaning to renew oneself day after day. With those words at heart, I occasionally seek a change in my development environment.

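For a while I was a bit skeptical about the "modal editing" environment that Vim represents. Why should I have to care about switching between Insert/Normal/Visual modes? If I could do everything without changing modes, wouldn't I save the time and attention that mode switching takes?...... Something like that.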

With that thought, I looked for something that could replace my precious Vim, and along the way I found Lite-XL. I am back on Vim now, but the five days with this Lua-based editor were refreshing enough that I thought it worth putting on record.

A tremendously small footprint

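Exactly what it says. Like Vim, the footprint is tiny. Memory usage is actually slightly higher than Vim's, but next to today's IDE-class, painfully heavy code editors that is nothing, and being higher than Vim is not bad either once you consider that it is Lua-based.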

But that also means that, like Vim, a vanilla install has really nothing in it (......). LSP, indent guides, highlighting other occurrences of the selected string: all of these have to be installed separately as plugins.

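Where Vim has .vimrc, Lite-XL has .config/lite-xl/*. The Lua scripts in this directory make configuration and initialization easy. Of course, it's no big problem if, like me, you don't know Lua. Just following the written instructions carefully is enough. Well...... in my case I had to read them cautiously in case I missed something, but that's probably because English isn't my mother tongue. ;)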

Responsiveness is top-notch

Lite-XL always responded without delay. Scrolling animations are a bonus. From searching across files to the command palette, it felt as if it were shouting "I told you I'm really light!" It was pleasant to use.

LSP: both good and bad

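The LSP plugin is satisfying. The plugin's official GitHub repository says it is not finished yet, but most major features work well, the overlay that shows the current state is helpful, and I liked that it doesn't feel "intrusive" (if you have seen the mass of virtual text that Vim LSP plugins put up, you'll know what I mean by "intrusive"). Overload display was also far better in Lite-XL.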

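However, the symbol search dialogs and commands are a little confusing, and a few utilities, such as symbol search with a list of symbol types, are missing, which left me wanting. Overall I still think it's decent. It may only have felt confusing because I didn't put enough time and effort into getting used to a new interface, and didn't change my workflow to suit Lite-XL.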

Shortcuts almost everywhere

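The developers of this editor seem to have been quite bothered by their right hands going back and forth between keyboard and mouse. The design somehow feels like it works hard to minimize mouse use.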

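If a feature is used often, it invariably has a shortcut assigned. I got the feeling that the Lite-XL developers actively use Lite-XL in their own development environments and are "speed freaks". You can see an obsession with never wasting a moment while typing.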

Just one thing: window splitting did confuse me. Lite-XL uses <Alt> + ijkl, while Vim uses Ctrl-W with the famous hjkl combination, so I couldn't help mixing up directions. :P

It doesn't handle very large files well (feat. 4GB.txt)

It's not common, but I sometimes have to deal with log files of a few GB. Vim 9 is excellent here. It's all ready within seconds.

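When I tried opening the same file in Lite-XL, I had to wait a few minutes, and the editor sometimes stuttered. I don't think it's because it was developed in a scripting language. No need to look far: Visual Studio Code, written in JavaScript, handled the same file quite well (well..... except for memory usage).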

Anyway, I don't think this applies to everyone. If you have no need to handle very large text files, you needn't worry about it.

No git checkout while editing (huh?)

With several files open in Lite-XL, I ran git checkout on the project, which changed some of those files, and the editor died (oops). I don't know whether I was just unlucky or it's a bug, but anyway...... that's what happened.

A small, nimble editor that falls short in a few places

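There's a half-joke in Korea that women write restaurant ratings and reviews with all their heart, whereas men write a review in only two cases: when the place is so awful (......) that "one victim, me, is enough!", or when they find a restaurant so good that "a place this good absolutely must be made known!"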

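Lite-XL is clearly the latter. I strongly recommend it, especially if you are looking for a lightweight editor to replace Visual Studio Code but want to avoid Vim's insane, cliff-like learning curve (note: I am a Vim user, but I strongly disagree with the idea that Vim is the only road to maximum productivity. There is more than one way to reach a goal. Seen that way, VS Code offers a broad environment that can cover a great many setups and use cases. Even for C++ alone, it's not common to be able to choose between MS's IntelliSense and clang).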

In my case I went back to Vim because a few features were lacking (large-file handling in particular hurt), but if you never have to handle large files, giving it a try wouldn't be a bad idea.

A 5-day journey with Lite-XL

2024-07-25T09:43:00.011+09:00

There's an ancient Chinese saying (also famous in Korea), 日新又日新, meaning "renew yourself day by day." With the saying in mind, I occasionally try new development environments and tools, as other good developers do.

For a while I was a bit skeptical about the "modal editing" scheme represented by Vim, where you switch modes (Insert, Normal, Visual) to do certain things. Why should I have to care about the mode? If I could do everything without changing modes, wouldn't it save both time and mental resources (e.g. keeping track of the mode)?

With that in mind, I searched for a good substitute for my precious Vim, and during the search I found Lite-XL. Though I'm back on Vim, the 5-day journey with the Lua-based code editor was impressive, so I'd like to leave a record of the fun I had along the way.

A tiny footprint

Literally. Like Vim, it has a small footprint. Memory consumption on a fresh run was a bit higher than Vim's, but compared with modern "heavyweight" IDE-like code editors, and considering that it's built on Lua, a versatile scripting language, it was impressive.

However, it also means that it comes with barely anything, like vanilla Vim. To be more productive, you need to install and configure plugins, e.g. LSP, indent guides, or highlighting other occurrences of the current selection in the document.

And where Vim has .vimrc, Lite-XL has .config/lite-xl/*. There are a handful of Lua scripts there, and you can add your own configuration and initialization as needed. Configuration is fully manual and in Lua, but no worries even if, like me, you're not familiar with the language. Follow the instructions for each plugin and you'll be fine, though I had to read carefully so as not to miss any details (maybe that's because I'm not a native English speaker? ;) ).

Fully responsive, always

Lite-XL was always responsive, with nice scrolling animations. Whether searching across files or using the command palette, it always felt like it was shouting "I'm so lightweight I can fly!".

LSP: some better, some missing

The LSP plugin is satisfactory. Though the official GitHub repository for the Lite-XL LSP plugin says it is WIP, most major features are already usable, and the overlay was quite informative and, most of all, non-intrusive. I'm sure you know what I mean if you've seen the messages in virtual text when running Vim LSP plugins. Even the display of overloads was better in Lite-XL.

However, its symbol search dialogs and commands were a bit confusing, and it lacked some small utilities I enjoy (e.g. symbol search with a full symbol list and their types). But that's fine: maybe I was just not familiar with the new interface, and I didn't invest much time or effort in changing my workflow for it.

Hot keys (almost) everywhere

The developers must have been bothered by their right hands traveling between keyboard and mouse, and they want to minimize mouse use.

Where there is a frequently used feature, there's a hot key dedicated to it. I'm sure the Lite-XL developers use Lite-XL themselves for their own development, and that they're "speed freaks" who refuse to waste any time between keystrokes.

The only thing that confused me was window splitting: Lite-XL uses `<Alt> + ijkl`, while Vim uses `Ctrl-W` with its famous `hjkl` combination. :P

Handling very big files is slow (feat. 4GB.txt)

It's rare, but sometimes I have to deal with log files a few GB in size. Vim 9 handles them really well: it opens the file in a few seconds, ready to use.

When I tried to open the same file with Lite-XL, it took a few minutes, and the editor sometimes lagged. I don't think that's simply a limit of scripting languages, as Visual Studio Code, written in JavaScript, handled the same file really well, except for memory consumption (lol).

Anyway, this won't apply to everyone, so you can safely ignore it if you don't have to handle REALLY BIG text files.

Don't git checkout while editing (huh?)

While I had some files open in Lite-XL, I ran git checkout on my project, some of the open files changed, and the editor crashed (oops). I'm not sure whether I was just unlucky or it was a bug, but anyway it happened.

Small, versatile, but a few oops

There's a joke in Korea that while women are enthusiastic about rating and reviewing restaurants, men write a review on only two occasions: when it's so bad that you'd announce "don't go there, one scapegoat is enough", or when it's so great that you think "man, this is damn great, I want to spread the word so the restaurant can keep going."

For Lite-XL, it is the latter. It's damn great, especially if you're thirsty for a lightweight alternative to Visual Studio Code yet don't want to face the steep learning curve of Vim (one thing: though I'm a Vim user, I don't think everyone needs to learn Vim for maximum productivity. Rather, I'm against that idea: there are many ways to accomplish your goals. With that in mind, I think VS Code can cover quite a lot of use cases and ways of doing things, like choosing between IntelliSense and clangd for C++ development).

I had to return to Vim because I missed a few things (handling GB-size files was critical), but if you don't have to deal with big files, I strongly recommend giving it a try.

Today's Oops 20240628

2024-06-28T12:20:00.000+09:00

Long time no see! Today I'd like to introduce yet another "oops" case.

Well, nothing special, right? The syntax is fine too....... But this code was actually supposed to have this structure:

Well, hasn't everyone run into something like this at least once?

Today's Oops 20240628

2024-06-28T12:00:00.016+09:00

Long time no see! Today I'd like to share yet another OOPS moment.

Nothing much, huh? But the code should have looked like the following:

Well, don't tell me you had no chance to experience something like this. ;)

Today's Oops 20240528

2024-05-28T12:29:00.004+09:00

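Yes, yes. I'm well aware. I've been writing on this blog for years, yet there are hardly any posts. Then one day an idea suddenly struck me: what if I wrote some self-deprecating humor, wouldn't people crack up?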

So...... here goes the first drop:

Now, find what's wrong with this code. :D
(This code does not run.)

Today's Oops 20240528

2024-05-28T12:29:00.000+09:00

Yes. I know. I know. I've been writing on this blog for years, but posts are few and far between. Then all of a sudden an idea popped up: it might get a giggle out of someone if I push the "self-destruction" button myself by sharing my coding mistakes.

And here comes the very first drop:

Now, find what's wrong with the code above. :D
(it doesn't compile at all.)

Catching the scent of old Object Pascal in unfamiliar Rust

2024-04-20T13:26:00.005+09:00

I'm building a new program these days, and I figured I'd take the chance to study a bit, so I'm writing it in Rust.

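Since I'm diving in headfirst, I'm spending my happy(?) days fighting(?) with rust-analyzer, which nags and meddles in every little thing like a head-family mother-in-law on the day of the ancestral rites for a great-great-grandfather...... and today I found something interesting.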

Rust treats this statement as an error.

To fix it, you have to change it like this.

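Seeing this code reminds me of the old Object Pascal (Delphi) days. Unless it was a primitive type, you declared the object variable in the var section and then had to initialize it in the implementation, otherwise it would spew runtime errors right away, and the structure here is exactly the same. Rust is no different, though the difference would be that it can detect at compile time that memory has not been allocated.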

For a language that is itself out of the mainstream, Pascal has influenced all sorts of(?) languages to no end. Java, Python, JavaScript and even Rust have all borrowed the structure of Object Pascal to at least some degree......

As a former devotee of the language, it makes me feel wistful.

See old Object Pascal from new Rust

2024-04-20T12:59:00.004+09:00

Nowadays I'm working on a new application, and thinking it was a good chance to learn something new, I tried Rust.

As a newbie (or noob) in this area, I'm enjoying my time fighting rust-analyzer, which is always whining like Grouchy Smurf...... (lol) And today I found something interesting.

In Rust, the following is an error.

To fix it, the first line should be changed as follows.

Seeing the code reminds me of the good old days of Object Pascal (Delphi). If it's not a primitive type, you've got to declare the variable in the var clause and call the constructor in the implementation, or you'll get a runtime error. Rust "inherited" that structure as it was, except that it catches the missing allocation at compile time.

Pascal is a minor programming language in itself, yet it has influenced so many other languages: Java, Python, JavaScript, and now Rust....... They all adopted at least some part of Object Pascal.

As a once-devoted follower of the language, I felt wistful seeing this.

PostgreSQL vs. SQLite: read & write in multithreaded environment

2024-03-07T10:12:00.004+09:00

It started humbly. I needed to cache some data, and I thought I'd just push it into a database table, add an index, and leave the rest to the database. There were only 2 TEXT fields, and I only needed to look up rows by one of them (basically a key-value store), so I thought any database engine would do.

And yes. It was a BIG mistake.

First I tried SQLite, and I found that in a multithreaded environment some records evaporated when threads wrote to the table simultaneously, even with the -DSQLITE_THREADSAFE=2 compile-time option. I pushed the same data under the same conditions, and sometimes I had only 20 records, other times 40, and yet other times 26....... What drove me crazier was that SQLite itself worked fine, without any I/O errors. A good moment to shout "WHAT THE HELL?!" in real time.

So I changed the engine to PostgreSQL. Our trustworthy elephant friend saved all the records without any loss. I was satisfied with that, but...... even though I had applied a b-tree index to the relevant field, it took 100 milliseconds just to run SELECT field2 WHERE field1='something'. And no, the table wasn't large: there were only 680 records, and the data length was at most 30 characters for field1 and just 4 characters for field2. I had tuned the engine's configuration, and it worked fine for bigger tables, so I felt assured about its performance; I never expected something like this, even in my dreams.

The elephant is tough, but as a side effect it's too slow.......

So, one last chance: I ran pg_dump to move the data from PostgreSQL to SQLite, and under the same conditions (same index, same table structure, ......) I turned on .timer in the SQLite shell and ran the query: it took less than 0.001 second. Yoo-hoo!
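For anyone who wants to repeat a measurement of this kind without the SQLite shell, here is a rough Python sketch using the standard sqlite3 module. Everything specific in it is an assumption for illustration: the table name `cache`, the index name, the generated keys and the in-memory database. It is not the data or setup from the post, and the number it prints will vary from machine to machine.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")       # in-memory database, for the demo only
conn.execute("CREATE TABLE cache (field1 TEXT, field2 TEXT)")
conn.execute("CREATE INDEX idx_cache_field1 ON cache (field1)")

# 680 made-up rows: a short text key and a 4-character value.
rows = [(f"key-{i:04d}", f"{i:04d}") for i in range(680)]
conn.executemany("INSERT INTO cache VALUES (?, ?)", rows)
conn.commit()

query = "SELECT field2 FROM cache WHERE field1 = ?"

# Ask SQLite how it intends to run the lookup (index search or full scan).
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("key-0123",)).fetchall()

start = time.perf_counter()
row = conn.execute(query, ("key-0123",)).fetchone()
elapsed = time.perf_counter() - start

print(plan[0][-1])
print(row[0], f"{elapsed:.6f} s")
```

The query plan line is worth a glance in any engine: it shows whether the index is actually being used for the lookup.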

After some more experiments, I found that SQLite can't fully prevent data loss by itself even with the multithread support option enabled; you need external support such as std::mutex. I guess that its fread() call doesn't fully serialize in a multithreaded environment, but I have neither the time nor the ability to investigate properly. :P

Anyway, I now use the combination of SQLite + WAL mode + a larger SQLite internal cache + std::mutex. Write performance still looks good, but if needed, I think I could use more files, with load balancing via a non-cryptographic hash.
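A minimal sketch of that combination, written in Python rather than C++ so that it runs with the standard library alone. The assumptions are many: `threading.Lock` stands in for std::mutex, the class name `KeyValueCache`, the table name `cache`, the 64 MiB cache size and the thread counts are all made up for the example, and it says nothing about what actually caused the lost records in the post. For context, the SQLite documentation describes SQLITE_THREADSAFE=2 as "multi-thread" mode, in which a single connection must not be used by two threads at the same time; whether that is what happened here is not known.

```python
import os
import sqlite3
import tempfile
import threading


class KeyValueCache:
    """Hypothetical wrapper: one shared connection, every access behind a lock."""

    def __init__(self, path):
        self._lock = threading.Lock()    # plays the role of std::mutex
        # check_same_thread=False lets several threads share this connection;
        # the lock is what actually keeps them from using it at the same time.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._conn.execute("PRAGMA journal_mode=WAL")
        self._conn.execute("PRAGMA cache_size=-65536")   # about 64 MiB of page cache
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (field1 TEXT PRIMARY KEY, field2 TEXT)"
        )
        self._conn.commit()

    def put(self, key, value):
        with self._lock:
            self._conn.execute(
                "INSERT OR REPLACE INTO cache (field1, field2) VALUES (?, ?)",
                (key, value),
            )
            self._conn.commit()

    def get(self, key):
        with self._lock:
            row = self._conn.execute(
                "SELECT field2 FROM cache WHERE field1 = ?", (key,)
            ).fetchone()
        return row[0] if row else None

    def count(self):
        with self._lock:
            return self._conn.execute("SELECT COUNT(*) FROM cache").fetchone()[0]

    def close(self):
        with self._lock:
            self._conn.close()


def demo(threads=8, per_thread=85):
    """Write threads * per_thread records concurrently and return the row count."""
    with tempfile.TemporaryDirectory() as tmp:
        cache = KeyValueCache(os.path.join(tmp, "cache.db"))

        def worker(t):
            for i in range(per_thread):
                cache.put(f"key-{t}-{i}", f"{i:04d}")

        pool = [threading.Thread(target=worker, args=(t,)) for t in range(threads)]
        for th in pool:
            th.start()
        for th in pool:
            th.join()
        total = cache.count()
        cache.close()
        return total


print(demo())
```

With every write serialized through the lock, the final row count should match the number of records written, which is the property that was missing in the unprotected version described above.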

PostgreSQL vs. SQLite: reading and writing in a multithreaded environment

2024-03-07T09:53:00.004+09:00

So....... it started small. I had some data to cache, and the idea was that if I piled it into a DB and put an index on it, the DB would take care of the rest. The data was nothing much, just two TEXT fields, and lookups only needed one of the two, a very simple key-value store structure, so I complacently (.......) thought any DB would do.

Yep. That was a big misjudgment.

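I tried SQLite first, but when writes ran concurrently in a multithreaded environment, some records were lost even with the -DSQLITE_THREADSAFE=2 option added at compile time. Inserting the same data the same way under the same conditions, sometimes there were only 20 records, sometimes 40, sometimes 26...... and on top of that, nothing even appeared to malfunction (......). Truly a perfect moment to shout "What the hell" live.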

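So I switched the DB to PostgreSQL. Our steadfast, sturdy elephant friend takes in all the data and keeps it safe with no lost records. And just as I was feeling satisfied....... despite a properly applied b-tree index, a single SELECT field2 WHERE field1='something' takes a whole 100 milliseconds. It's not that the table was big, either: there were only some 680 records, and one field was 30 characters at most while the other was fixed at 4 characters. I had tuned the configuration in my own way, and since large tables ran relatively fast I had felt at ease, so I never dreamed I would hit a performance problem in something this trivial.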

So the elephant is solid, but insanely slow in exchange......

So...... just in case, I dumped the data from PostgreSQL with pg_dump, moved it to SQLite, and ran the same SELECT. With .timer on in the SQLite shell, it came in under 0.001 seconds (......).

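After further testing, it turned out that SQLite cannot completely guard against data loss even with the multithread support option enabled, and you have to add separate external support such as std::mutex. My guess is that the fread() call does not properly support serialization under multithreading, but I have neither the time nor the skill to trace it that far...... :P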

Anyway, I'm now using the combination of SQLite + WAL mode + a larger internal cache + std::mutex. The write speed still seems sufficient, but if more is needed, I could increase the number of files and do load balancing with a non-cryptographic hash.
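The idea of spreading keys over several database files with a non-cryptographic hash could look roughly like this in Python. It is a sketch under assumptions: CRC-32 from the standard zlib module is used as the hash simply because it is readily available, and the shard count, file naming and function names are hypothetical.

```python
import zlib


def shard_for(key, shard_count=4):
    """Map a key to a shard index with CRC-32, a fast non-cryptographic checksum."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def shard_path(key, shard_count=4, prefix="cache"):
    """Hypothetical file naming scheme: cache-0.db, cache-1.db, ..."""
    return f"{prefix}-{shard_for(key, shard_count)}.db"


# Count how 680 made-up keys would be spread over 4 files.
keys = [f"key-{i:04d}" for i in range(680)]
usage = [0] * 4
for key in keys:
    usage[shard_for(key)] += 1
print(usage)
```

The same key always maps to the same file, so reads know where to look, and each file can then have its own connection and its own lock.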

Haste makes waste

2024-02-19T11:16:00.002+09:00

Sometimes I find that old sayings really do carry wisdom, and recently I had a chance to reaffirm one. Haste makes waste, so I took a detour, and it saved me time in the end.

The details are as follows: I was assigned to develop a feature that extracts some data, which looks like a piece of cake but actually isn't. I have to extract both a summary and a "body" from a single piece of raw data, and it should be faster to do both in one loop. However, given the existing data storage process, I thought I had to split this into two separate threads, or at least that was my first impression. So I intentionally delayed the implementation, which was expected to take one day, and kept busy with unrelated things for three days, and I found a far better alternative: use only one loop, and divide the data into subsections.

Personally I consider the "incubation effect" to be of the utmost importance. It has been confirmed in psychology experiments: when you face a difficult problem, doing something unrelated for a while lets you see more, and see better, when you return to it. There is no agreement yet on the actual mechanism, but that's the job of academia, and my job is to use it, with my thanks to those psychologists. :D

It's nothing big, but it reinforced my habit with an "I'm not wrong!", so I'm dropping a line here.

When in a hurry, take the long way round

2024-02-19T11:02:00.001+09:00

Every now and then I'm struck anew that the sages of old were never wrong. I felt it once again recently...... they said to take the long way round when in a hurry, so I did, and in the end I saved more time.

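The situation was roughly this. I had to build a new feature that extracts certain data, and it struck me that it looked simple on the surface but really wouldn't be. Two things, a summary and the main data, had to be extracted from a single piece of raw data, and I figured it would run faster overall if the loop ran only once. But taking the existing data storage scheme into account, it seemed I had no choice but to split this into two threads...... It was a one-day job, but after deliberately dragging my feet for about three days doing other things(?), the answer came. Keep a single loop and divide the data into subsections, and it would work out somehow.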

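One of the things I value in my everyday work is the incubation effect. When given a hard problem, instead of concentrating on it blindly, you do something completely unrelated and then come back to the original problem, and a new horizon you couldn't see before comes into view; this effect has been verified through actual psychology experiments. As far as I know, academia is still arguing over why the effect occurs, but revealing the inner workings of the black box is academia's domain, and all I have to do is use it. :D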

It's a small thing, but it became one more reinforcing factor, a chance to confirm once again that "I wasn't wrong!", so I'm jotting it down.

Journey to the Vim IDE, and what I learned

2023-09-06T10:36:00.006+09:00

It's a bit late, but I learned that Bram Moolenaar, the developer of Vim, passed away last month. I offer my sincere respect to him and to his legacy, the great text editor. Rest in peace.

I'm not sure whether this is a side effect, but I spent a few days building an IDE-like environment in Vim. Since it was my second attempt, the result was satisfactory, but I concluded that I'd continue to use Visual Studio Code (VS Code) and put off applying the environment to production work. There are some reasons:

My hands got WAY TOO DIRTY with customization. Plugins collide here and there. Some features that are built into VS Code have to be provided by plugins in Vim, and more plugins mean more breaking points, since they change settings internally without considering each other.
Most people overlook this, but you can use VS Code without a mouse for more than 95% of actions. There are not only hot keys but also the command palette with fuzzy search support. Yes: the Vimmers' motto, "we don't need a mouse!", is not a "gift" exclusive to Vim.

And personally, I noticed that I was trying to make my Vim IDE as similar to VS Code as possible. Not only behave mswin but also tab movement, the file explorer, etc. If that's the case, I'd rather stay in VS Code than make the effort to make Vim resemble VS Code.

In my experience, the strength of Vim is in editing, not insertion. For example, you have commands like da", di', dt(, or the difference between O and o. This is the very strength of a modal editor, which provides a "special mode" just for editing. Yet I spend most of my time on insertion, not editing, and what I use most while doing that is autocompletion at best. Vim certainly does a wonderful job, but considering my use case, most of its features go unused. Jumping between normal mode and insert mode and enduring the inconvenience just for features I'd use once a month at best is quite absurd.

Wrapping up, I'm sure that Vim is quite a treasure, but it doesn't fit me. As someone said on Reddit, "VS Code versus Vim is not about what's better. They're just different and appeal to different type of people."

To my dear heavy Vim users: if you read this post, please don't be angry, but think of it as a practical case of "there's no such thing as a shirt that fits everyone."

So everyone, happy Vimming!

P.S:

Yet I'm dissatisfied with the performance drops caused by VS Code's structural limits (JavaScript.......). Vim is a bit better, but Vimscript is also a scripting language, so to me they're about the same anyway. :P

On that note, while looking for an alternative to VS Code I found Lapce (https://lapce.dev/), which is still pre-alpha but promising. If I get the chance, I'll write a post about it.

A journey to the Vim IDE, and what I learned

2023-09-06T10:15:00.000+09:00

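It's a little late, but a full month after the fact I heard the news that Bram Moolenaar, the developer of Vim, had passed away. I remember him, the developer of a great text editor. May he rest in peace.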

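Whether it's an aftereffect of that...... I can't say, but I poured a few days into turning Vim into an IDE and applied it to a project I was developing personally. Perhaps because it was my second attempt, the result was quite satisfying, but I nevertheless decided to hold off on using Vim as an IDE and keep using Visual Studio Code (VS Code below), which I had been using. The reasons: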

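Customization takes far too much work. Plugins clash badly. Even features that VS Code supports out of the box have to be added as plugins, and as the plugins pile up, they change settings among themselves, collide, and quite often make the program misbehave.
This is something people overlook, but VS Code too can be used without a mouse for more than 95% of operations. Shortcuts aside, the command palette is not there just for show, and it even supports fuzzy search. In short, Vim's claim of "we don't need a mouse!" applies just as well to VS Code.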

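And personally, I found that while making Vim into an IDE I was trying to make it resemble VS Code. behave mswin as a given, then tab movement, a file explorer and so on: I really was building it to be like VS Code. I felt a pang of self-reproach: if it comes to this, why go to the trouble of using Vim instead of just using VS Code?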

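As I see it, where Vim is strong is editing text rather than entering it. There are commands like da", di', dt(, or the difference between O and o. And this is also an inherent advantage of a modal editor: because it provides a dedicated mode for editing text, it is strong at editing. But...... in my case I spend far more time on plain input than on editing text. And the feature I need then is autocompletion at most. Vim is certainly a great editor, but looking at my use case alone, a good many of all those features hardly ever get used. Putting up with the inconvenience of going back and forth between normal mode and insert mode for features I'd be lucky to use once or twice a month seems to me like putting the cart before the horse.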

In short, Vim is a great editor, but it doesn't seem to suit the way I work. I can't forget what someone on Reddit said: "VS Code versus Vim is not about what's better. They're just different and appeal to different type of people."

Heavy Vim users, if you happen to read this, please don't be upset; I'd appreciate it if you just took it as one application of the old adage that "there is no shirt that fits everyone perfectly."

Well then, everyone, happy Vimming!

P.S:

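That said, I'm definitely unhappy with how VS Code performs comparatively poorly because of its inherent limitation (being JavaScript-based). Vim is somewhat better in that sense, but considering that Vimscript is script-based too, in some respects it's six of one and half a dozen of the other......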

On a related note, while looking for a replacement for VS Code I found a project called Lapce, still pre-alpha but showing promise (https://lapce.dev/). If I get the chance, I'll post about this program later.

Vim vs. Neovim: still immature

2023-06-03T23:51:00.007+09:00

I had a chance to try Neovim, which is quite a hot topic among Vim users. As far as I know, Neovim started as a fork of Vim to include some features rejected by the Vim maintainers. I know only two things about Neovim: it adopted Lua as a scripting language alongside Vimscript, and it includes an LSP client (after v0.4).

I was really not satisfied with the LSP support in Vim, so I was curious how Neovim's native LSP works. Conclusion? Well...... Before even getting to LSP, its handling of big files seems to be still living in the era of Vim 7. The memory consumption of Vim 9 and Neovim 0.9 is quite similar, but while Vim has let you jump around a big file without any delay since Vim 8, Neovim showed quite a formidable delay when I tried to jump from the start to the end in one go. Initial loading was also a bit slow....... Though others unanimously say "it's GREAT!", for me it lacks some basics as a text editor. Also, the Windows installer was a bit immature compared with vim-win32-installer.

LSP is important to me too, but my job frequently requires processing multi-GB text files, so I think I'll stay on Vim for the time being.

P.S.1: My Main Development Environment
For your reference, I use Visual Studio Code as my main development environment. Vim is mainly used to edit some texts.

P.S.2: About Vim's LSP Plugins
Well, for me, CoC (https://github.com/neoclide/coc.nvim), the final boss(?), isolates itself outside the Vim ecosystem (if I have to run Node.js inside Vim, I'd rather use Visual Studio Code), vim-lsp (https://github.com/prabirshrestha/vim-lsp) is weak at displaying diagnostic results, and ALE (https://github.com/dense-analysis/ale) doesn't show function signatures or support code formatting....... Each of them is missing at least one core feature.

And today, I found this: https://github.com/yegappan/lsp

You've got to get your hands a bit dirty to configure it (you even have to register your LSP servers manually), but in the end it's just editing some text files (:P). And with proper configuration it provides useful information, as you see in the above screenshot. And one more thing: it is developed in Vim9script; in other words, it doesn't work in Neovim.

P.S.3: Vim vs. Neovim - a Bridge of No Return
While Neovim concentrates on Lua, Vim created Vim9script, and Neovim announced that it won't support Vim9script. I think this will mark the parting of ways between the two projects. Personally it reminds me of the end of "full compatibility" between MySQL 8 and MariaDB 10 due to differences in JSON support. The difference? In the MySQL / MariaDB case the fork, MariaDB, shows better performance than MySQL, while in the Vim / Neovim case the original, Vim, is better(at least to me).

Vim vs. Neovim: A Fruit That Isn't Quite Ripe Yet

2023-06-03T23:35:00.002+09:00

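I briefly tried Neovim, which has been the hottest topic among Vim users for the last few years. As far as I've heard, Neovim started as a fork that added features people had asked Vim to implement, only to be turned down with "hey, Vim is complete as it is". Its main features would be adopting Lua as a scripting language and shipping a built-in LSP client(since v0.5).

I had been unhappy with every LSP option for Vim anyway, so I tried Neovim out of curiosity about what its built-in LSP looks like. The conclusion, well....... Never mind LSP; its handling of large files seems to be stuck in the Vim 7 days. Memory usage is about the same for Neovim 0.9 and Vim 9, but while Vim 8 can fly(......) back and forth through a big file, in Neovim you have to wait quite a while to get from the start of the file to the end. Initial loading is slightly slow too...... How should I put it. Everyone who has used it says it's great, but to me the basics felt a bit clumsy. The Windows installer also seemed to fall short in many ways compared with Vim Win32 Installer.

LSP matters, but I quite often have to process large text files(several GB or more), so I'll probably stay with Vim for the time being.

P.S.1: My Main Development Environment
Oh, for reference, I use Visual Studio Code as my main development environment. Vim is mainly for text editing.

P.S.2: About Vim's LSP Plugins
First, CoC(https://github.com/neoclide/coc.nvim), roughly the final boss(......), lives on its own outside the Vim ecosystem(if I'm going to run Node.js inside Vim, I'd just use Visual Studio Code), vim-lsp(https://github.com/prabirshrestha/vim-lsp) has poor on-screen display features, and ALE(https://github.com/dense-analysis/ale) doesn't show function signatures and has no code formatting feature....... Each of them felt like it was missing one thing I need.

Then today I found something decent. Here it is: https://github.com/yegappan/lsp

Configuration is a bit of a hassle(you also have to register LSP servers manually), but it's only editing text files after all, and once tuned it displays information as in the screenshot above, which is really easy on the eyes. And the decisive point....... it's written in Vim9script. Yep. It can't be used in Neovim(......).

P.S.3: Vim vs. Neovim - a Bridge of No Return
While Neovim was pushing Lua, Vim created Vim9script, and Neovim declared that it won't support Vim9script. I think this will probably be where the two part ways. Personally it reminds me of how compatibility between MySQL 8 and MariaDB 10 split over differences in JSON support. The difference is that with MySQL / MariaDB the fork, MariaDB, performs better than MySQL in many ways, while with Vim / Neovim the original, Vim, looks better than Neovim(at least in my environment).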

std::move = time to send in std::optional

2023-01-31T19:02:00.002+09:00

Recently I had a chance to take a look at Rust. When returning from a function, Rust commonly uses std::option::Option to return either a value or nothing, and its sibling std::result::Result to return either a success value or an error.

And I found something similar in C++, namely std::optional. Many C++ users have argued, "why use std::optional when we can make full use of null pointers?" According to the C++ Committee, the goal was to minimize human errors. Then we can ask one thing: what kind of errors?

Let's take a look at the code below:

struct Insider {
    int value1;
    /* whatever great data structure */
};

struct anti_memory_leak {
    Insider *insider=nullptr;
    ~anti_memory_leak() {
        delete insider; // deleting a null pointer is a no-op
    }
};

There's nothing special in this code. Since Insider is used only optionally, it is allocated on the heap when needed. When anti_memory_leak is destroyed, the Insider object is deleted in the destructor as well, so we have a safeguard against memory leaks. We proved that we can do it without std::optional.

...... Did we?

Then let's investigate the code below:

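void doSomethingGreat()
{
    anti_memory_leak object1;
    object1.insider=new Insider();

    std::vector<anti_memory_leak> vector1;
    vector1.push_back(std::move(object1)); // copies the raw pointer: there is no move constructor
    vector1.back().insider->value1=20; // object1.insider points to the very same Insider
} // CRASH! both destructors delete the same Insider

This function typically crashes, though not where you might expect. The cause is the push_back() line. Because anti_memory_leak declares its own destructor, the compiler doesn't generate a move constructor for it, so even though you use std::move() the object is copied with the implicit copy constructor, which just copies the raw pointer. Now object1 and vector1.back() both believe they own the same Insider. When the function ends both destructors run, and the second one deletes memory that is already freed: a double delete, which is undefined behavior and usually a crash. Something similar happens whenever the vector reallocates: the elements are copied to the new storage and the old ones are destroyed, so the insider of every surviving element becomes a dangling pointer. Unfortunately, the result is the same if you use emplace_back() instead of push_back().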
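One way to see what push_back(std::move(object1)) really does to a class like this, without triggering the crash itself, is the minimal sketch below. It assumes C++17; Tracker, released, and releasedAfterMove are hypothetical names, and the destructor only records the pointer it would have deleted instead of deleting it.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Insider {
    int value1 = 0;
};

// Hypothetical stand-in for anti_memory_leak: same shape, but the destructor
// only records the pointer it would delete, so the demo has no undefined behavior.
struct Tracker {
    Insider *insider = nullptr;
    static inline std::vector<Insider*> released; // pointers "deleted" so far
    ~Tracker() {
        if (insider) released.push_back(insider);
    }
};

// Returns every pointer that the destructors tried to release.
std::vector<Insider*> releasedAfterMove(Insider &payload)
{
    Tracker::released.clear();
    {
        Tracker object1;
        object1.insider = &payload; // nothing is really heap-allocated here

        std::vector<Tracker> vector1;
        // A user-declared destructor suppresses the implicit move constructor,
        // so this line copies object1, raw pointer included.
        vector1.push_back(std::move(object1));
        assert(object1.insider == vector1.back().insider);
    } // the vector's element and object1 are both destroyed here
    return Tracker::released;
}
```

Calling releasedAfterMove(payload) yields two entries that both equal &payload: one from the vector's element and one from object1. With a real delete in the destructor, the second release would be the double delete.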

Now is the time std::optional should be used. If you declare a member as std::optional, the storage for the value is reserved inside the object itself, without any heap allocation, and the value is constructed only when you explicitly assign or emplace it. Copying, moving, and destroying the optional take care of the contained value automatically, so no hand-written destructor is needed. Of course there's a small overhead, but the same goes for other similar(?) classes like std::shared_ptr. We have raw pointers, but to manage our precious memory more safely, we can automate some of the management so that we can avoid potential incidents(both memory leaks and dangling pointers) more easily.

If we refactor the code above using std::optional, it will be like this:

#include <optional>
#include <utility>
#include <vector>

struct Insider {
    int value1;
    /* whatever great data structure */
};

struct anti_memory_leak {
    std::optional<Insider> insider; // empty until a value is assigned
    // no destructor needed: std::optional destroys its own value
};

void doSomethingGreat()
{
    anti_memory_leak object1;
    object1.insider=Insider();

    std::vector<anti_memory_leak> vector1;
    vector1.push_back(std::move(object1));
    vector1.back().insider->value1=20; // OK
}
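The behavior can be checked with a small sketch like the one below. It assumes C++17, and Holder and movedCopyIsIndependent are hypothetical names standing in for the structure above: the optional starts empty, and after push_back(std::move(...)) the vector's element owns its own Insider, so changing it doesn't touch the original object.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

struct Insider {
    int value1 = 0;
};

// Hypothetical stand-in for anti_memory_leak with an optional member.
struct Holder {
    std::optional<Insider> insider; // starts empty; no Insider exists yet
};

// Returns true if the element stored in the vector is independent of object1.
bool movedCopyIsIndependent()
{
    Holder object1;
    if (object1.insider.has_value()) return false; // must start empty

    object1.insider = Insider{}; // stored inside the optional itself, not on the heap
    object1.insider->value1 = 10;

    std::vector<Holder> vector1;
    vector1.push_back(std::move(object1));
    vector1.back().insider->value1 = 20; // touches only the vector's element

    // A moved-from std::optional still holds a (moved-from) value; for a plain
    // int member that value is unchanged.
    return object1.insider.has_value()
        && object1.insider->value1 == 10
        && vector1.back().insider->value1 == 20;
}
```

No destructor is written anywhere in this sketch, and nothing is released twice: each Holder destroys only the Insider stored inside it.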

If the flow of the code is simple, this won't be a big problem. However, once the flow gets complicated in any way(e.g. multithreading), sooner or later you will free memory you didn't mean to, or miss freeing it when you should. Let's think of std::optional as a kind of insurance policy: we all feel that insurance premiums are somewhat a "waste of money"(lol), yet we pay them to prepare for the worst. I think it's the same for std::optional.

std::move = Time for std::optional to Move Out

2023-01-31T18:46:00.001+09:00

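Recently I had a chance to look into Rust. When a function returns a result, Rust commonly uses std::option::Option to return either a value or nothing, and its sibling std::result::Result to return either a success value or an error.

So...... just in case, I looked around and found that C++ has std::optional too. Most users, though, wondered why anyone would use it: we have nullptr(which we love and hate), so why bother with such a thing...... The C++ Committee's position is said to be that it reduces mistakes, but then what kind of mistakes can actually happen?

First, let's look at the following code.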

struct Insider {
    int value1;
    /* whatever great data structure */
};

struct anti_memory_leak {
    Insider *insider=nullptr;
    ~anti_memory_leak() {
        delete insider; // deleting a null pointer is a no-op
    }
};

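Certainly, nothing looks unusual in this code. Insider is sometimes used and sometimes not, so it is created on the heap only when needed. When anti_memory_leak is destroyed, the Insider object is deleted along with it if it exists, which also prevents memory leaks. We have proved that there is no problem at all without std::optional.

...... Or have we?

Let's look at the following code.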

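void doSomethingGreat()
{
    anti_memory_leak object1;
    object1.insider=new Insider();

    std::vector<anti_memory_leak> vector1;
    vector1.push_back(std::move(object1)); // copies the raw pointer: there is no move constructor
    vector1.back().insider->value1=20; // object1.insider points to the very same Insider
} // CRASH! both destructors delete the same Insider

This code typically blows up, though not where you might expect. Why? The cause is the push_back() line. Because anti_memory_leak declares its own destructor, the compiler doesn't generate a move constructor for it, so even with std::move() the object is copied with the implicit copy constructor, which just copies the raw pointer. Now object1 and vector1.back() both believe they own the same Insider. When the function ends both destructors run, and the second one deletes memory that is already freed: a double delete, which is undefined behavior and usually a crash. Something similar happens whenever the vector reallocates: the elements are copied to the new storage and the old ones are destroyed, so the insider of every surviving element becomes a dangling pointer. Regrettably, using emplace_back() instead of push_back() behaves the same way. Either way, it blows up all the same.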

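This is exactly when std::optional should move out. If you declare a member with std::optional, the storage for the value is reserved inside the object itself, without any heap allocation, and the value is constructed only when you actually assign or emplace it. Copying, moving, and destroying the optional take care of the contained value automatically, so no hand-written destructor is needed. Of course there is a little more overhead, but the same is true of classes with a similar(?) role such as std::shared_ptr. We have raw pointers, but to use memory more safely, part of its management is automated so that the memory management problems a developer might overlook(both memory leaks and dangling pointers) can be avoided easily.

Rewritten with std::optional, the code above could look like this.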

#include <optional>
#include <utility>
#include <vector>

struct Insider {
    int value1;
    /* whatever great data structure */
};

struct anti_memory_leak {
    std::optional<Insider> insider; // empty until a value is assigned
    // no destructor needed: std::optional destroys its own value
};

void doSomethingGreat()
{
    anti_memory_leak object1;
    object1.insider=Insider();

    std::vector<anti_memory_leak> vector1;
    vector1.push_back(std::move(object1));
    vector1.back().insider->value1=20; // OK
}

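In fact, as long as the code that handles the object has a simple flow there is no big problem, but with multithreading, or once the code gets complicated in whatever way, you will surely end up freeing memory without realizing it, or failing to free it when you should. std::optional is a kind of insurance against such cases. People who pay insurance premiums pay them(grudgingly, sure) to prepare for the worst, don't they? I think std::optional is the same.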

Handshake One: Find Out Your SYNs, the Source of a DoS Attack, or the List of Connecting IPs

2022-08-09T09:25:00.003+09:00

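It was a scorching summer, with COVID-19 breaking infection records all season long. I was watching the administrators of a small online community fight a DoS(Denial of Service) attack. They were passionate, but the work was not very efficient. The only tools they had were web server logs and their old friend, iptables.

In fact, in an environment with no room to invest in security, tools like iptables or nftables are all you can rely on against this blockheaded(......) yet fairly effective attack. But once an attack hits, people's minds go blank. The logs are too long and complicated, and 'picking out' attacker IPs from user IPs is quite hard.

So, purely to help such people, I made a small program that I named Handshake One. With this program you can quickly identify the source IPs of a DoS attack, or at least see which IPs are currently connecting to your service. It collects the SYN packets sent from client IPs over the last 60 seconds and produces a report like the one below: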

Handshake One TCP SYN counter report
(C)Copyright 2022 Robert Teminian.
This application is provided free of charge, and provided AS IS: though the developer hopes that this would help the user in any way, the software does NOT guarantee anything at all.

Stop by the developer's blog and leave a comment! Visit http://codenested.blogspot.com

====================
At 1660000053
IP Hits
192.168.1.26 14
Total 14

====================
At 1660000054
IP Hits
192.168.1.26 11
Total 11

====================
At 1660000055
IP Hits
192.168.1.26 1
Total 1

Executables are currently provided for three operating systems; if you need one for another operating system, please let me know. Click the links below to download the files.

All of the files above are compressed archives. Just extract them wherever you like.

Usage is as follows

Install: Windows

Extract the ZIP file wherever you like
On Windows, Handshake One uses npcap(https://npcap.com/) to capture packets. Install the library directly, or install Wireshark(https://www.wireshark.org/), which installs npcap together with the analyzer
After installing npcap, copy Packet.dll and wpcap.dll from C:\Windows\System32\Npcap to the program directory

Install: Linux

Extract the TGZ file wherever you like
On Linux, Handshake One uses libpcap. In most cases libpcap gets installed along with tcpdump, but if it isn't installed, please refer to the documentation of your distribution's package manager(apt, yum, ......) to install it

Configuration

The only way to configure Handshake One is to edit the HandshakeOne.json file. The currently supported settings are as follows

resultpath: sets the directory where the report file is saved. Handshake One creates a "HandshakeOneReport.txt" file in that directory and refreshes(overwrites) it every 30 seconds

If you use Linux, I recommend setting this to a RAM disk partition(e.g. /tmp) to reduce the system's I/O load

sniffer: the name of the device to capture packets from. On Linux you can generally use the name shown by commands such as ip link or ifconfig, but on Windows you must use the internal device name(e.g. \Device\NPF_{12345678-9ABC-DEF0-1234-567890ABCDEF}) rather than the commonly displayed name, which can make configuration a little tricky. To see the device names along with their descriptions(e.g. Realtek PCIe GbE Family Controller), run Handshake One with the "show" parameter(e.g. HandshakeOne show)
reportsizelimit: when refreshing the report, limits the size of the report file to the given size(in bytes). If the given size is exceeded, only the data up to the timestamp currently being written is written. For example, if the data written at 10:00:00(covering 09:59:00~09:59:59) is 1.5MB, everything is saved on refresh; if the past minute of data written at 10:01:00(covering 10:00:00~10:00:59) totals 2.5MB and the report exceeds 2MB while the data for 10:00:47 is being written, the program finishes writing up to 10:00:47, stops writing the report, and then refreshes the report at 10:02:00.

Run the application

Windows: just run HandshakeOne.exe

If you want to run Handshake One as a Windows Service, I think you could use nssm(http://nssm.cc/). I haven't used it myself, but it is quite well reviewed

Linux: there are two ways

Run HandshakeOne directly. Note: since libpcap requires root privileges, HandshakeOne must be run as root using sudo, su, or the like
You can set Handshake One up as a systemd service. Edit the HandshakeOne.service file as needed(at the very least, ExecStart and WorkingDirectory must be adjusted to reflect the program's current location), and register the program as a systemd service with reference to the commands below

sudo cp HandshakeOne.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable HandshakeOne
sudo systemctl start HandshakeOne

That's all for the description of the program! I hope you find it useful. If you have suggestions, opinions, or questions, please leave a comment below. Thank you.

Handshake One: Know Your SYNs, the Source of a DoS Attack, or at Least Your Client IP Profile

2022-08-09T09:00:00.005+09:00

It was a hot summer, with COVID-19 setting new infection records day by day. I watched the managers of a small online community defend their server against a DoS(Denial of Service) attack. They were enthusiastic, but I found the work quite inefficient. The only things they could rely on were some web server logs and our good old friend iptables.

Well, when you're too small to invest in security, the only viable options against those dull and stupid, yet quite effective, attacks are iptables and nftables. However, when an attack actually hits, we usually end up puzzled and stuck: the logs are too long and complicated to read, so it's hard to tell attacker IPs from user IPs.

So, I developed a small utility named Handshake One to help server engineers who want to find the sources of a DoS attack as early as possible, or at least learn the IP profile of their service. This small utility collects the SYN packets sent from client IPs over the past 60 seconds and generates a report like the one below:

Handshake One TCP SYN counter report
(C)Copyright 2022 Robert Teminian.
This application is provided free of charge, and provided AS IS: though the developer hopes that this would help the user in any way, the software does NOT guarantee anything at all.

Stop by the developer's blog and leave a comment! Visit http://codenested.blogspot.com

====================
At 1660000053
IP Hits
192.168.1.26 14
Total 14

====================
At 1660000054
IP Hits
192.168.1.26 11
Total 11

====================
At 1660000055
IP Hits
192.168.1.26 1
Total 1
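The per-second counting behind a report like this can be sketched as below. This is only an illustration under assumptions, not Handshake One's actual source: SynCounter, isInitialSyn, record, prune, and report are hypothetical names, packet capture itself is left out, and the sketch assumes that only initial SYNs(SYN set, ACK clear) are counted.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// True for a connection-opening segment: SYN (0x02) set, ACK (0x10) clear.
inline bool isInitialSyn(std::uint8_t tcpFlags)
{
    return (tcpFlags & 0x02) != 0 && (tcpFlags & 0x10) == 0;
}

// Hypothetical aggregator: counts SYNs per source IP for each epoch second.
class SynCounter {
public:
    void record(std::int64_t epochSecond, const std::string &ip)
    {
        ++hits_[epochSecond][ip];
    }

    // Drops every second that falls outside the last windowSeconds seconds.
    void prune(std::int64_t now, std::int64_t windowSeconds = 60)
    {
        hits_.erase(hits_.begin(), hits_.lower_bound(now - windowSeconds + 1));
    }

    // Renders one section per second, in the layout of the sample report.
    std::string report() const
    {
        std::ostringstream out;
        for (const auto &[second, perIp] : hits_) {
            std::uint64_t total = 0;
            out << "====================\n"
                << "At " << second << "\nIP Hits\n";
            for (const auto &[ip, count] : perIp) {
                out << ip << ' ' << count << '\n';
                total += count;
            }
            out << "Total " << total << "\n\n";
        }
        return out.str();
    }

private:
    std::map<std::int64_t, std::map<std::string, std::uint64_t>> hits_;
};
```

A capture loop would call record() for every packet where isInitialSyn() is true, then prune() and report() on each refresh. Keeping the outer map ordered by timestamp makes dropping old seconds a single range erase.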

Currently binaries are provided for three operating systems, but you can contact me if you need binaries for other OSes. Click the link below to download the executable binary for each:

They're just compressed files. Decompress the file into any directory you want to use.

The usage is as follows:

Install: Windows

Decompress the ZIP file in any location you want
On Windows, Handshake One depends on npcap(https://npcap.com/) for capturing packets. Please install it separately, or install Wireshark(https://www.wireshark.org/), which installs both the packet analysis tool and npcap.
After installing npcap, copy Packet.dll and wpcap.dll from C:\Windows\System32\Npcap to the executable directory

Install: Linux

Decompress the TGZ file in any location you want
On Linux, Handshake One depends on libpcap. Usually it is installed alongside tcpdump, but if it's not, use your Linux distribution's package manager(apt, yum, ......) to install the package

Configuration

The only way to configure Handshake One is via its configuration file, HandshakeOne.json. Currently the following keys are supported

resultpath: the directory where the report file is saved. Handshake One will automatically generate a file named "HandshakeOneReport.txt" in that directory and update(overwrite) it every 30 seconds.

On Linux, I recommend setting the directory to some RAM disk(e.g. /tmp) so that the system's I/O load is reduced

sniffer: the name of the device to capture packets from. On Linux it's the device name shown by commands like ip link or ifconfig, but on Windows it's a bit tricky, since the name npcap refers to is NOT the "human readable name" of the interface but an internal device name like \Device\NPF_{12345678-9ABC-DEF0-1234-567890ABCDEF}. To see the device names and the corresponding human readable descriptions(e.g. Realtek PCIe GbE Family Controller), run Handshake One with the "show" parameter, e.g. HandshakeOne show
reportsizelimit: the maximum size of the report file, in bytes, applied whenever the report is updated(actually overwritten). If the file grows beyond this size, the application finishes writing the data for the timestamp it is currently on and then stops. For example, if the report written at 10:00:00(covering 09:59:00~09:59:59) is 1.5MB, it will contain everything. But if the report written at 10:01:00(covering 10:00:00~10:00:59) would be 2.5MB and it hits the limit(say, 2MB) while writing the data for 10:00:47, the application completes the data up to 10:00:47, stops writing, and refreshes the report again at 10:02:00
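The size limit described for reportsizelimit can be sketched as follows. This is an illustration under assumptions rather than the program's real code: writeLimited is a hypothetical helper, and each string stands for the complete report section of one timestamp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends whole per-timestamp sections in order. The section that pushes the
// output past limitBytes is still written completely; writing stops after it.
std::string writeLimited(const std::vector<std::string> &sections,
                         std::size_t limitBytes)
{
    std::string out;
    for (const auto &section : sections) {
        out += section;
        if (out.size() > limitBytes) break; // limit crossed: finish here
    }
    return out;
}
```

With sections "aaaa", "bbbb", "cccc" and a limit of 6 bytes, the result is "aaaabbbb": the second section crosses the limit, is still completed, and the third is dropped. The report can therefore end up slightly larger than the limit, by at most one section.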

Run the application

In Windows, just run HandshakeOne.exe

If you're interested in running Handshake One as a Windows Service, I think you can use nssm(http://nssm.cc/). Though I have no experience using it, I've seen very positive reviews in many places.

In Linux, you have two choices

Run HandshakeOne directly. Caution: since libpcap needs root privileges, HandshakeOne must be run with commands like sudo or su.
You can register Handshake One as a systemd service. Edit HandshakeOne.service as needed(at least ExecStart and WorkingDirectory must be changed to match the exact path of the binary) and refer to the following commands to register the binary as a systemd service

sudo cp HandshakeOne.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable HandshakeOne
sudo systemctl start HandshakeOne

So...... That's all, folks! I hope you enjoy the application. If you have any comments, opinions, or questions, please leave a comment.

Flutter/Dart: First Impression

2022-01-16T18:14:00.001+09:00

All of a sudden, I had a chance to briefly review Flutter. It covers desktop, mobile, and web with a single programming language, and for the web it compiles to JavaScript instead of WebAssembly, which takes forever to compile - isn't it lovely and beautiful? (Oh, Qt has a WebAssembly port, but it takes forever to compile the code, which may be because of that big fat core library).

Someone said that the biggest weakness of Flutter is Dart, but I don't agree. The structure of Dart is quite interesting, and for people coming from other programming languages it is both accessible and understandable(I'm looking at you, Rust). The only thing I dislike is that darn garbage collector, but I think I have to accept it if Dart doesn't want programmers to control pointers manually in any way. Or, if they forced RAII in whatever way, everyone would be angry. :P

I think the biggest drawback of Flutter is state management. In other frameworks you don't have to distinguish stateful widgets from stateless ones, but in Flutter you must do so explicitly. What's more, when build() is called, the framework automatically finds what changed and applies only those changes. For some, the word "automatic" will mean "yeah, that's Google for you!". But as a C++ developer myself, I have to ask: "why compare everything when you can know which part changed?" To me, it looks like the Flutter developers designed it to rebuild everything, and then separated stateful widgets from stateless ones to minimize the overhead on the GC(the garbage collector). For me, the property binding of Qt is better than this.
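To make the idea of reacting only to what changed concrete, here is a minimal C++ sketch of a property that notifies its listeners only when its value actually changes. It is neither Qt's binding API nor Flutter's mechanism; Property, subscribe, and countLabelRedraws are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical observable value: listeners run only on a real change.
template <typename T>
class Property {
public:
    explicit Property(T initial) : value_(std::move(initial)) {}

    const T &get() const { return value_; }

    void set(T next)
    {
        if (next == value_) return; // nothing changed: nobody is notified
        value_ = std::move(next);
        for (const auto &listener : listeners_) listener(value_);
    }

    void subscribe(std::function<void(const T &)> listener)
    {
        listeners_.push_back(std::move(listener));
    }

private:
    T value_;
    std::vector<std::function<void(const T &)>> listeners_;
};

// Counts how often a pretend "label" would be redrawn.
int countLabelRedraws()
{
    Property<int> counter{0};
    int labelRedraws = 0;
    counter.subscribe([&labelRedraws](const int &) { ++labelRedraws; });

    counter.set(1); // changed: one redraw
    counter.set(1); // same value: no redraw
    counter.set(2); // changed: one more redraw
    return labelRedraws;
}
```

Here the dependent knows exactly which value changed, so nothing has to be rebuilt and compared afterwards. The price is that every dependency must be wired up explicitly.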

I'm afraid Google's approach here would hurt the readability of the code. If you read Introduction to widgets, part of the official Flutter documentation, you see "the separation of responsibility allows greater complexity to be encapsulated in the individual widgets, while maintaining simplicity in the parent". To me, it sounds like a warning to prepare for a burning(......) hell when you have to change a stateless widget. What if you have to move some part of a stateless widget to the stateful side? And there's no sure way to keep that from happening. Most people choose Python or JavaScript because they provide a very flexible development environment for requirements that change in real time, but I think that kind of flexibility would be a bit hard to get with Flutter.

I'm not sure about others' opinions, but for me Dart is okay while I'm a bit against Flutter. It's like...... it forces a specific development process and design concept. Do I really have to do it this way? I'm not sure.