Machine learning for CAPTCHA recognition

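I have been looking at a few machine-learning tools lately. One of them is Tesseract, an OCR engine that converts images to text. These days I have also been thinking about the National Natural Science Foundation of China (NSFC); its project lookup page is guarded by a simple but very annoying CAPTCHA, so I took that as a test case and wrote a small program. It calls Tesseract for OCR and ImageMagick's convert for image conversion, and the recognition rate is fairly high (training Tesseract specifically for this CAPTCHA would probably do better, but from a quick look that seems rather involved). In principle this means all NSFC project information could be crawled, though I have no time to pursue that.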

  • Qt 5 / MinGW environment
  • ImageMagick-7.0.7-22-portable-Q16-x86
  • Tesseract-OCR-4.0.0alpha
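
The pipeline described above (clean the image with ImageMagick, then pass it to Tesseract) can be sketched roughly as follows. This is a minimal Python sketch, not the Qt program from the post. It assumes the `magick` and `tesseract` executables are on the PATH; the grayscale, 200% upscale and 50% threshold steps and the file names are illustrative guesses, not the settings actually used.

```python
import subprocess
from pathlib import Path


def preprocess_command(src, dst, threshold_percent=50):
    """Build an ImageMagick call: grayscale, upscale, then binarize.

    The preprocessing steps are assumptions chosen for illustration.
    """
    return [
        "magick", str(src),
        "-colorspace", "Gray",
        "-resize", "200%",
        "-threshold", f"{threshold_percent}%",
        str(dst),
    ]


def ocr_command(image):
    """Build a Tesseract call that prints the text to stdout.

    --psm 7 asks Tesseract to treat the image as a single text line.
    """
    return ["tesseract", str(image), "stdout", "--psm", "7"]


def clean_text(raw):
    """Keep only letters and digits from the raw OCR output."""
    return "".join(ch for ch in raw if ch.isalnum())


def recognize(src):
    """Run both tools on one CAPTCHA image and return the cleaned text."""
    src = Path(src)
    cleaned = src.with_name(src.stem + "_clean.png")  # hypothetical temp file
    subprocess.run(preprocess_command(src, cleaned), check=True)
    result = subprocess.run(
        ocr_command(cleaned), check=True, capture_output=True, text=True
    )
    return clean_text(result.stdout)
```

Calling `recognize("captcha.png")` would return the recognized characters, provided both tools are installed. How well it works depends heavily on the preprocessing, which would need tuning for a particular CAPTCHA style.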

A reference book on hydrological data analysis, available for download

The authors, Hipel and McLeod, offer the book for public download on their home page; the original page is http://www.stats.uwo.ca/faculty/aim/1994Book/. The condition for free use is:

“We give our permission to use this material freely for teaching provided that the following citation is included: “Time Series Modelling of Water Resources and Environmental Systems” by Keith W. Hipel and A. Ian McLeod.”

(In short: the book may be used freely for teaching, provided the citation above is included.)

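This book on hydrological time series runs to more than a thousand pages and is very rich in content. Unfortunately the original page is blocked in some parts of China and cannot be reached, so I am reposting the book here for the convenience of scholars in China. Please respect the conditions of use set by the authors.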

Hipel K W, McLeod A I. Time Series Modelling of Water Resources and Environmental Systems. Amsterdam, The Netherlands: Elsevier, 1994, 1013 pp.

Links: Link1 (Baidu, ~63MB)

New DeltaCopy build: DeltaCopy_rsync3.12

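This build supports rsync 3.1.2 and Cygwin 2.3.1 and eliminates the permission problem on Windows. The rsync, cygwin1 and a few other files come from cwRsync_5.5.0_x86_Free; I only did a simple repack and some testing.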

The permission problem on Windows is described in this post. With this build, the perms option is no longer needed.

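*I also tried extracting the files from the latest Cygwin, but they do not work alongside DeltaCopy, mainly because the chmod permission problem cannot be resolved. The chmod handling in cwRsync has probably been modified.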

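a. If the DeltaCopy service is already installed, uninstall it before installing this build, as follows:

1 In Services (under Control Panel), find the DeltaCopy server service and stop it

2 Open cmd with administrator privileges

3 Run sc delete deltacopyservice; it should report that the deletion succeeded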

b. Download DeltaCopy_rsync3.12.zip (the download link is given later in the post) and extract it to a folder.
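
The uninstall steps in (a) can also be scripted. The following is a minimal Python sketch, assuming the service is registered under the name `deltacopyservice` as given above and that the script runs from an elevated (administrator) prompt. It uses `sc stop` in place of stopping the service through the Services panel, and defaults to a dry run that only prints the commands.

```python
import subprocess

SERVICE_NAME = "deltacopyservice"  # service name as written in the post


def uninstall_commands(service=SERVICE_NAME):
    """Return the two Windows `sc` calls: stop the service, then delete it."""
    return [["sc", "stop", service], ["sc", "delete", service]]


def uninstall(service=SERVICE_NAME, dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    for cmd in uninstall_commands(service):
        if dry_run:
            print(" ".join(cmd))
        else:
            # `sc stop` fails harmlessly if the service is already stopped,
            # so a non-zero exit code is not treated as fatal here.
            subprocess.run(cmd, check=False)
```

Running `uninstall()` only prints the two commands; `uninstall(dry_run=False)` executes them.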

A JoH paper on routing

Zhang L, Nan Z*, Liang X*, Xu Y, Hernandez F, Li L. Application of the MacCormack Scheme to Overland Flow Routing for High-spatial Resolution Distributed Hydrological Model. Journal of Hydrology. 2018, 558: 421-431.

Abstract:

Although process-based distributed hydrological models (PDHMs) have evolved rapidly over the last few decades, their extensive applications are still challenged by the computational expenses. This study attempted, for the first time, to apply the numerically efficient MacCormack algorithm to overland flow routing in a representative high-spatial resolution PDHM, i.e., distributed hydrology-soil-vegetation model (DHSVM), in order to improve its computational efficiency. The analytical verification indicates that both the semi and full versions of the MacCormack schemes exhibit robust numerical stability and are more computationally efficient than the conventional explicit linear scheme. The full-version outperforms the semi-version in terms of simulation accuracy when the same time step is adopted. The semi-MacCormack scheme was implemented into DHSVM (version 3.1.2) to solve the kinematic wave equations for overland flow routing. The performance and practicality of the enhanced DHSVM-MacCormack model were assessed by performing two groups of modeling experiments in the Mercer Creek watershed, a small urban catchment near Bellevue, Washington. The experiments show that DHSVM-MacCormack can considerably improve the computational efficiency without compromising the simulation accuracy of the original DHSVM model. More specifically, with the same computational environment and model settings, the computational time required by DHSVM-MacCormack can be reduced to several dozen minutes for a simulation period of three months (in contrast with one day and a half by the original DHSVM model) without noticeable sacrifice of the accuracy. The MacCormack scheme proves to be applicable to overland flow routing in DHSVM, which implies that it can be coupled into other PDHMs for watershed routing to either significantly improve their computational efficiency or to make the kinematic wave routing for high resolution modeling computationally feasible.

Keywords: MacCormack Scheme; Overland Flow Routing; DHSVM; Kinematic Wave; Computational Efficiency

Links: Link1 (Elsevier, 50 days' free access from Feb 4, 2018); Baidu
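
For readers unfamiliar with the scheme named in the abstract, here is a minimal Python sketch of one textbook MacCormack predictor-corrector step for the 1-D kinematic wave equation, dh/dt + dq/dx = r with q = alpha * h**m. It is not the paper's semi- or full-version implementation and has no connection to the DHSVM code. The flux law, the exponent m = 5/3, the fixed upstream depth and the treatment of the last node are all assumptions made for illustration.

```python
def flux(depth, alpha, m):
    """Kinematic-wave discharge per unit width, q = alpha * h**m."""
    return alpha * max(depth, 0.0) ** m


def maccormack_step(h, dt, dx, rain, alpha=1.0, m=5.0 / 3.0):
    """Advance water depths h (list, upstream to downstream) by one step.

    Node 0 is an upstream boundary whose depth is held fixed (assumption).
    """
    n = len(h)
    c = dt / dx
    q = [flux(v, alpha, m) for v in h]

    # Predictor: forward difference in space. The last node has no
    # downstream neighbour, so it falls back to a backward difference.
    hp = list(h)
    for i in range(1, n):
        dq = q[i + 1] - q[i] if i < n - 1 else q[i] - q[i - 1]
        hp[i] = max(h[i] - c * dq + dt * rain, 0.0)
    qp = [flux(v, alpha, m) for v in hp]

    # Corrector: backward difference on the predicted values, averaged
    # with the old depth.
    new = list(h)
    for i in range(1, n):
        dq = qp[i] - qp[i - 1]
        new[i] = max(0.5 * (h[i] + hp[i] - c * dq + dt * rain), 0.0)
    return new
```

Like other explicit schemes, this step is only stable when the time step is small enough relative to dx and the wave celerity (a CFL-type condition), so dt must be chosen with that in mind.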

Captcha Decoder: pwntcha

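I recompiled it under Cygwin 2.9, which was not easy: all sorts of problems came up, mainly because the code is so old that the libraries it depends on have moved on. In the end it turned out to be of little use for what I care about, since it only copes with fairly simple CAPTCHAs. I have packaged the compiled program and shared it here (link below).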

The main documentation page for pwntcha is here: http://caca.zoy.org/wiki/PWNtcha

Some common CAPTCHA styles it can handle are as follows:

File permission problems when copying between two Windows machines with DeltaCopy

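Cygwin's file permission model does not map cleanly onto NTFS, which causes file permission problems on Windows. After an rsync run, every file and folder gets several <not inherit> user entries added and the current user is denied access. As a result, the current user can no longer open the directory that the virtual directory points to.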

The following workaround fixes the denial of the current user, but the <not inherit> user entries are still added.

  1. On the DeltaC side, add --perms (two ASCII hyphens).
  2. In DeltaC's Options, tick "Change permissions on server to read/write".

If some directories have already become inaccessible, use the following steps:

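  1. In DeltaS, right-click the virtual directory and choose fix file permission. With many files this can take a while, so be patient.
  2. On a parent directory that is still accessible, set suitable permissions and propagate them to the files and directories below it. With many files this also takes some time.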
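
The second repair step can also be done from the command line. The following is a minimal Python sketch, assuming Windows' built-in `icacls` tool is used instead of the Explorer security dialog. It is not what the fix file permission command in DeltaS runs internally, and the folder path in the usage note is hypothetical. `/reset` replaces the ACLs with the default inherited ones, `/T` recurses into subfolders, and `/C` continues past errors.

```python
import subprocess


def reset_acl_command(folder):
    """Build an icacls call that restores inherited ACLs below `folder`."""
    return ["icacls", str(folder), "/reset", "/T", "/C"]


def reset_acls(folder, dry_run=True):
    """Print the command, or run it when dry_run is False."""
    cmd = reset_acl_command(folder)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
```

For example, `reset_acls(r"D:\backup\target")` prints the command for review. Running it for real needs an elevated prompt, and the resulting permissions depend on what the parent folder grants.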